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1.0  SUMMARY 


In  this  project,  several  fundamental  research  topics  have  been  carried  out  for  developing  multiagent 
task  coordination  strategies  under  a  distributed  optimization  framework.  The  proposed  subjects  are 
critical  to  the  development  of  engineered  multiagent  systems  such  as  robotic  networks,  sensor  net¬ 
works,  and  computer  networks,  and  they  are  important  to  both  military  and  civilian  applications. 
The  objectives  of  the  proposed  research  are  three-folds:  perform  systematic  controllability  analy¬ 
sis  for  multiagent  networks  which  may  have  nonlinear  dynamics,  design  distributed  optimal  and 
adaptive  coordination  protocols  in  the  presence  of  various  model  and  communication  uncertain¬ 
ties,  and  conduct  computer  simulation  and  experimental  validation  of  the  proposed  designs  using 
mobile  robotic  platforms. 

The  project  renders  novel  methods  for  discontinuous  cooperative  control  under  least  restrictive 
sensing/communications  and  communication  delays,  approximate  dynamic  programming  based 
optimal  cooperative  control,  adaptive  cooperative  control  of  uncertain  multiagent  systems,  and 
distributed  formation  control  and  coverage  controls  of  multiple  mobile  robots  with  kinematic  con¬ 
straints.  In  particular,  the  following  three  sets  of  results  have  been  obtained: 

•  By  analyzing  the  least  restrictive  condition  for  sensing/communication  among  multiagents, 
it  is  revealed  that  network  consensus  may  not  be  achieved  in  the  presence  of  discontinuous 
system  dynamics.  To  address  the  issue,  a  new  discontinuous  cooperative  law  was  proposed 
to  achieve  the  task  coordination  of  multiagent  systems  under  directed  and  switching  sens¬ 
ing/communication  topologies  dm.  In  particular,  we  have  shown  the  resilience  of  the  pro¬ 
posed  nominal  cooperative  control  to  certain  extent  in  terms  of  communication  link  failures 
and  time-delays. 

•  The  optimal  and  adaptive  task  coordination  for  multiagent  systems  was  thoroughly  studied 
from  three  aspects.  First,  a  practically  implementable  valued  function  approximation-based 
multiagent  policy  iteration  algorithm  was  proposed  for  the  optimal  cooperative  control  of  a 
class  of  nonlinear  systems  lOTI.  In  the  design,  system  behaviors  are  quantized  using  indi¬ 
vidual  cost  functions  in  order  to  direct  the  optimal  operation  of  multiagent  systems.  Second, 
to  further  relax  condition  for  the  requirement  of  system  dynamics,  approximate  Q  function- 
based  multiagent  coordination  algorithm  was  proposed.  Third,  the  adaptive  coordination  of 
multiagent  systems  with  uncertainties  was  studied.  A  new  distributed  adaptive  cooperative 
control  was  proposed  to  deal  with  model  uncertainties  using  neural  network  approximation 
and  adaptive  estimation  of  unknown  parameters. 

•  Simulation  and  experimental  study  was  conducted  to  test  the  robustness  of  the  proposed 
coordination  controls  for  multiagents  based  on  case  studies  for  formation  control  and  cov¬ 
erage  control  of  multiple  mobile  robots  [|271  [H  [30l|.  Specifically,  for  formation  control  of 
multiple  mobile  robots,  both  linearization-based  design  and  nonlinear  model-based  design 
were  proposed  by  assuming  that  only  limited  information  of  a  desired  trajectory  is  available. 
For  coverage  control,  we  proposed  a  distributed  deployment  algorithm  for  mobile  robots  to 
cover  a  convex  region.  The  proposed  deployment  algorithm  iteratively  updates  the  Voronoi 
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partition  through  local  information  exchange,  and  then  moves  toward  its  eentroid  based  on 
eentroid-driven  eontrol  algorithms. 

Teehnieal  diseussions  of  these  three  topies  and  researeh  results  are  provided  in  the  following  see- 
tions. 


2.0  INTRODUCTION 

Multiagent  systems  are  generieally  defined  as  a  group  of  dynamieal  systems  in  whieh  eertain  emer¬ 
gent  behaviors  are  exhibited  through  loeal  interaetions  of  group  members  that  individually  have 
the  eapability  of  self-operating  ifT^  [|T^  ll2l  [[T9l .  The  fundamental  issues  in  the  study  of  multiagent 
systems  are  the  analysis  of  network  eontrollability  and  the  design  of  eoordination  eontrol  protoeol 
in  order  to  aehieve  autonomous  and  optimal  tasking  alloeation.  For  network  eontrollability,  the 
objeetive  is  to  figure  out  the  eonneetivity  eonditions  on  sensor/eommunieation  topologies  among 
agents  (ineluding  human  operators)  to  aehieve  desired  behaviors.  Reeent  results  on  eonneetiv- 
ity  eonditions  for  multiagent  systems  mostly  assume  perfeet  eommunieation  eonditions,  while  in 
praetiee  there  often  exist  eommunieation  uneertainties  and  bandwidth  limitations.  For  eoordina- 
tion  eontrol  protoeol  design,  the  objeetive  is  to  develop  the  proper  eontrol  protoeol  to  perform  the 
eoordinated  tasks.  The  existing  results  in  literature  may  not  be  direetly  applieable  to  multiagent 
task  alloeation  due  to  possible  link  errors,  long  eommunieation  delays,  and  system  uneertainties. 
More  importantly,  eonsidering  the  possible  dynamie  task  evolution  for  multiagent  networks,  the 
individual  agent  may  exhibit  multi-modal  dynamies  under  different  running  eireumstanees  or  due 
to  uneertainties  and  disturbanees.  All  of  these  pose  ehallenges  in  the  design  of  performanee  guar¬ 
anteed  distributed  eoordination  protoeols  that  explieitly  take  into  eonsideration  system  dynamies 
and  uneertainties. 

The  proposed  researeh  have  been  eentered  on  addressing  the  aforementioned  fundamental  issues 
by  targeting  the  following  objeetives:  perform  systematie  eontrollability  analysis  for  multiagent 
networks  whieh  may  have  nonlinear  dynamies,  design  distributed  optimal  and  adaptive  eoordi¬ 
nation  protoeols  in  the  presenee  of  various  model  and  eommunieation  uneertainties,  and  eonduet 
eomputer  simulation  and  experimental  validation  of  the  proposed  designs  using  mobile  robotie 
platforms. 

The  projeet  has  rendered  several  signifieant  results  in  the  development  of  distributed  optimal,  adap¬ 
tive  and  robust  eooperative  eontrol  protoeols  for  the  task  eoordination  of  multiagent  systems.  These 
results  have  reported  and  published  in  a  number  of  IEEE  eonferenees  |l2^[3TJ|^[8l|33,  and  jour¬ 
nal  versions  are  under  preparation  for  submission.  The  overall  eontributions  of  this  projeet  lie  in 
two  aspeets:  1)  the  learning  approaehes  borrowed  from  rieh  results  in  artifieial  intelligenee  re¬ 
seareh  are  effeetively  integrated  with  the  rigorous  eontrol  systems  analysis  tools,  and  produeed 
novel  approximate  dynamie  programming  based  optimal  eooperative  eontrol  and  adaptive  eooper¬ 
ative  eontrol  for  uneertain  multiagent  networks;  2)  the  researeh  outeome  on  new  task  eoordination 
algorithms  for  multiple  agents  operating  in  eomplex  environments  are  a  manifestation  of  robust 
intelligenee. 
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The  rest  of  the  report  is  organized  as  follows.  Section  3.0  presents  the  basic  methods,  assumptions 
and  procedures  in  this  research,  and  formulates  the  general  multiagent  coordination  problem.  Sec¬ 
tions  4.0,  5.0,  and  6.0  present  the  technical  results  and  discussions  on  three  sets  of  results  in  term 
of  optimal  and  adaptive  cooperative  control  design  and  applications,  respectively.  In  each  section, 
simulation  and  experimental  results  are  given  to  illustrate  the  effectiveness  of  the  proposed  designs. 
Section  7.0  concludes  the  report. 


3.0  METHODS,  ASSUMPTIONS,  AND  PROCEDURES 
3.1  Multiagent  Dynamics  and  Assumptions 

This  research  builds  upon  rigorous  methods  ranging  from  systems  and  controls  theory,  distributed 
reinforcement  learning,  adaptive  learning  to  neural  network  approximation.  We  consider  a  set  of 
agents  Q  =  {1,  ■  ■  ■  ,  N},  where  N  is  the  number  of  agents  in  the  group  and  assume  that  each  agent 
evolves  according  to  the  general  system  dynamics  described  by 

f  +Wi{t), 

\  yi{t)  =  +  Vi{t) 

where  i  =  1,  •  •  ■  ,  A^,  G  3?"*  is  the  state  vector,  m,  e  3?™*  is  the  input  vector,  m*  <  n*,  i/i  G 
is  the  output  (measurement)  vector,  Wi{t)  G  3?"*  and  Vi{t)  G  3?^*  are  Gaussian  noises  with  zero 
mean,  /*(■)  and  hi{-)  are  piecewise  continuous  vector- valued  functions  of  on  3?"'*,  and  A/j(-)  and 
Agi{-)  denote  model  uncertainties.  Agent  dynamics  considered  in  Q  are  given  by  the  first-order 
differential  equations  in  continuous  time.  The  analog  of  Q  in  discrete  time  can  be  defined  by  a 
system  of  first-order  discrete  time  equations  of  the  following  form 

(  +  =  fi{^i{k),Ui{k),k)  +  Afi{^i{k),k) +Wi{k), 

\  yi{k)  =  hi{^i{k),^j{k)) +Vi{k) 

where  k  G  {0, 1, 2,  •  •  •  }  is  the  discrete  time  index.  In  this  project,  we  deal  with  the  general  class 
of  multiagent  dynamical  systems,  and  the  agent  dynamics  may  assume  either  the  continuous-time 
model  in  ([T])  or  the  discrete-time  model  in  Q. 

To  achieve  multiagent  systems  coordination,  it  is  necessary  that  the  agents  in  the  group  are  capable 
of  exchanging  information  through  the  sensing/communication  networks.  For  agent  i,  its  output 
and  measurement  vector  yi  reflects  its  interaction  with  other  agents  in  the  group  through  commu¬ 
nication/sensor  channels.  In  addition,  we  define  a  coordination  variable  Xi  =  Xiiyi)  to  generically 
describe  the  coordination  tasks  for  multiagent  systems,  where  Xi  ■  3?*^  is  a  continuous 

and  differentiable  function  of  yt.  By  introducing  Xi,  various  coordination  tasks  such  as  consensus, 
rendezvous,  cooperative  target  localization,  mobile  agents  coverage  control,  distributed  resource 
allocation,  and  formation  control  may  be  embedded  into  the  definition  of  function  Xi{yi)-  To  this 
end,  the  multiagent  task  coordination  to  be  addressed  in  this  project  can  be  recast  as  cooperative 
stability  issues  as  defined  below. 
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Definition  1  Multiagent  systems  Q  or  (|^  are  said  to  be  cooperative  i/limt_>.oo  [xi{t)  —  Xj{t)]  = 
IgO,  where  Iq  is  q— dimensional  column  vector  with  all  its  elements  being  1.  Multiagent  systems 
a  or  ^  are  said  to  be  cooperatively  stable  (i.e.,  cooperative  and  all  the  state  variables  of  the 
systems  are  uniformly  bounded)  if  for  some  steady  state  a;*®  G  liiiii^oo  xfi)  =  x^^. 

As  seen  in  definition  the  steady  state  represents  the  convergence  value  of  the  coordination 
variables  xft)  for  all  agents  in  the  group.  For  example,  if  the  coordination  task  for  mulitagent 
systems  ([^  is  to  seek  the  average  consensus,  then  /-^• 

Now,  let  us  define  the  objective  function  Ui{x,  Ui,  f)  for  agent  i  to  accommodate  the  optimal  per¬ 
formance  of  the  multiagent  systems  coordination.  Let  the  objective  function  Ufx,  Ui,t)  be 

/OO 

Li(x(r),Mi(r))dr,  (3) 

where  x  =  [x'l,  •  •  •  ,  x'^Y  the  stacked  overall  coordination  variable,  Lf-)  and  ff)  are  the 

running  cost  functions.  To  this  end,  the  multiagent  systems  task  coordination  problem  studied  in 
this  project  can  be  generically  described  as  follows. 

Problem  1  For  a  network  of  mulitagent  dynamical  systems  O  or  (0.  design  cooperatively  stabi¬ 
lizing  control  protocols  uft)  of  the  form 

Ui{t)  =  ai{xi,Xj^,  -  ■  ■  ,Xj^,t),  (4) 

while  solving  the  following  optimization  problem 

n 

min  Ujjx)  (5) 

i=l 

where  Xj^,jk  G  A/i  are  the  coordination  variables  of  the  neighboring  agents  of  agent  i,  Mi  is  the 
index  set  of  the  neighboring  agents  of  agent  i. 


3.2  Sensing/Communication  Model  and  Procedures 


The  success  of  solving  coordination  Problem  [J  is  dependent  on  information  exchange  among 
agents.  In  general,  we  assume  that  the  information  exchange  among  agents  are  done  through  com¬ 
munication  broadcasting  or  agents’  sensing  capabilities.  We  consider  flexible  time-varying  sens¬ 
ing/communication  topologies  among  agents.  To  precisely  account  for  the  sensing/communication 
information  exchange  among  agents  in  the  design  of  coordination  strategy  and  control  protocols, 
we  introduce  the  following  sensing  /communication  matrix  and  its  corresponding  time  sequence 
{tl  :  A;  =  0, 1,  •  •  ■  }.  That  is,  within  time  interval  [/^,  the  sensing/communication  topology 
is  assumed  to  be  unchanged. 


S{t) 


Sll 

Sl2(/)  ■  ■ 

■  Slq{t) 

S2l(/) 

-S22 

■  S2q{t) 

SglW 

Sq2{t)  ■  ■ 

Sqq 

(6) 
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with  S{t)  =  E  where  Sa  =  1;  Sij{t)  =  1  if  the  jth  agent  is  in  the  sen- 

sor/eommunieation  range  of  the  ith  agent  at  time  t,  and  Sij  =  0  otherwise;  and  =  to.  It  ean 
be  assumed  without  loss  of  generality  that  0  <  <  Q  <  oo,  where  Cj  and  q  are  eon- 

stant  bounds.  In  the  presenee  of  oommunieation  delay  r,  the  available  information  at  time  instant 
t  will  depend  on  —  r). 

In  the  following  seetions,  we  report  several  multiagent  eoordination  eontrol  algorithms  in  solving 
Problem [J by  foeusing  on  proeedures  of  dealing  with  the  following  key  elements. 

•  First,  a  sensing/eommunieation  model  is  fundamental  to  deseribe  the  information  exehange 
among  multiple  agents  in  the  system.  One  of  the  main  objeetives  of  this  projeet  is  to  establish 
the  least  restrietive  network  eontrollability  eondition  for  multiagent  systems  to  aehieve  the 
task  eoordination. 

•  Seeond,  in  order  to  eover  a  broad  elass  of  praetieal  applieations  for  multiple  agents,  multi¬ 
agent  dynamies  are  of  paramount  importanee  in  the  eoordination  tasks.  We  design  optimal 
and  adaptive  eoordination  eontrols  for  a  general  elass  of  dynamieal  systems  with  uneertain- 
ties. 

•  Third,  multiagent  task  eoordination  applieations  are  eondueted  by  partieularly  solving  the 
formation  eontrol  and  eoverage  eontrol  problems  for  multiple  mobile  robots  with  kinematie 
eonstraints. 

•  Forth,  extensive  eomputer  simulation  and  experimental  tests  have  been  performed  to  illus¬ 
trate  the  proposed  designs. 


4.0  RESULTS  AND  DISCUSSION:  NETWORK  CONTROLLABILITY 
ANALYSIS 

In  this  seetion,  we  report  the  results  on  network  eontrollability  analysis  and  present  a  new  dis- 
eontinuous  eooperative  eontrol  for  eonsensus  of  multiagent  systems  under  direeted  and  switehing 
sensing/eommunieation  topologies  and  time-delays.  Simulation  test  results  for  underwater  sonar 
data  transmission  are  also  given. 

4.1  Sequential  Completeness  Condition  on  Network  Controllability 

One  of  the  key  issues  in  engineered  multiagent  systems  is  the  study  of  network  eontrollability. 
The  objeetive  is  to  figure  out  the  eonneetivity  eonditions  on  sensor/eommunieation  topologies  of 
the  network  for  aehieving  eonsensus  behavior.  In  iTTlllSOl.  the  eondition  is  obtained  for  eomposite 
undireeted  graphs  whieh  need  to  be  eonneeted.  Extensions  were  made  in  lfT7ll[|9ll  to  the  ease  with  di¬ 
reeted  graphs,  and  the  less  restrietive  eonditions  are  stated  as  that  there  exists  a  spanning  tree  or  the 
network  is  strongly  eonneeted  periodieally.  Complement  to  the  aforementioned  graph-theoretieal 
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methods,  a  matrix-theoretical  framework  is  developed  in  iTT^  to  deal  with  the  high-order  systems 
with  arbitrary  but  finite  relative  degrees.  The  notion  of  sequentially  completeness  was  introduced 
in  ifT^  ll29l  to  describe  the  least  required  condition  on  network  connectivity  for  cooperative  control 
design,  which  is  restated  by  the  following  definitions. 

Definition  2  Sensing/communication  matrix  sequence  {*S'(f)}  is  said  to  be  sequentially  lower- 
triangularly  complete  if  it  is  sequentially  lower-triangular  and  in  every  row  i  of  its  lower  triangular 
canonical  form,  there  is  at  lease  one  j  <  i  such  that  the  corresponding  block  Sij  (t)  is  uniformly 
non-vanishing. 


Definition  3  Sensing/communication  matrix  sequence  {*S'(f)}  is  said  to  be  sequentially  complete 
if  the  sequence  contains  an  infinite  subsequence  that  is  sequentially  lower-triangularly  complete. 

As  an  example  for  sequential  completeness,  let  us  assume  that  the  sensing/communication  topolo¬ 
gies  for  3  agents  are  changing  according  to  the  sequence  {S{tk),k  G  =  {1,2,...}}  defined 
below:  S{tk)  =  Si  for  k  =  Ap,  S{tk)  =  S2  for  k  =  Ap  +  I,  S{tk)  =  S'3  for  k  =  Ap  A-  2,  and 
S{tk)  =  S4  for  k  =  Ap  A- 3,  where  r/  G  M, 


■  1 

0 

0 

■  1 

1 

0  ■ 

1  — 

1 

1 

0 

,  52  = 

0 

1 

0 

_  0 

0 

1 

_  0 

0 

1 

■  1 

0 

0  ■ 

■  1 

0 

0  ■ 

0 

1 

0 

and  S4  = 

0 

1 

0 

1 

0 

1 

0 

0 

1 

(7) 


The  bitwise  union  of  S',,  z  =  1,  ■  ■  ■  ,  4  is 


(Js. 


- 1 

1 

1 - 

0 

1 

1 

I  0 

_ 1 

0 

1 

;  1 

A 

s'^,u ;  0 1 

L  ^^21 : 1 J 

It  then  follows  from  the  structure  of  IJ^  Si  that  the  corresponding  sequence  is  lower-triangularly 
complete,  and  therefore  the  switching  sensing/communication  topologies  defined  by  (|7])  is  sequen¬ 
tially  complete. 

4.2  Multiagent  Coordination  with  Discontinuous  Dynamics 

4.2.1  Issues  with  Sequential  Completeness  Condition  in  the  Presence  of  Discontinuous  Dy¬ 
namics 

Let  us  consider  the  consensus  problem  for  the  simplest  multiagent  systems  described  by  single¬ 
integrator  dynamics 

Xl  =  Ui, 
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(8) 


where  I  e  fl  =  {1,  •  •  •  ,n},  xi(t)  G  3?  is  the  state,  m;  G  3?  is  the  eontrol  input  to  be  designed.  The 
objeetive  is  to  design  ui{t)  to  aehieve  the  eonsensus  of  the  multiagent  system  ([^,  that  is, 

lim  xi{t)  =  X*,  VZ,  (9) 

t^OO 


where  x*  is  some  eonstant  denoting  the  eonsensus  value. 

For  the  eooperative  eontrol  of  multiagent  systems  ([^,  if  the  standard  design  of  Ui{t)  is  adopted  as 
given  below 

n 

-  Xi{t)),t  e  (10) 

i=i 


where 


Slj{t 


(11) 


T:=isu{tiy 

then  it  has  been  proved  in  that  the  sequential  eompleteness  of  sensing/eommunioation  matrix 
sequenee  {5'(f)}  is  the  neeessary  and  suffieient  eondition  for  eonsensus  of  multiagent  systems. 


However,  in  praetiee  it  is  often  neeessary  to  eonsider  the  following  diseontinuos  eooperative  eon¬ 
trol  of  the  form 


Mi)  =  -  xi{t)),t  (12) 

i=i 

where  is  a  nonlinear  gain  to  be  designed  based  on  the  sensing/eommunieation  topology 

S{tl)  as  well  as  the  available  boundary  values  Xj{tl)  if  sij{tl)  ^  0,  and  sgn(-)  funetion  is  defined 
as 

(  1,  ^  >  0 

sgn(2;)  =  <  0,  z  =  0 

1-1,  z<0 


Remark  1  By  including  sgn{-)  in  the  control  law  {12),  it  may  provide  benefits  to  deal  with  the 
control  of  truly  nonlinear  systems  such  as  nonholonomic  mobile  robots  0|/,  and  improve  the  con¬ 
vergence  speed  of  the  system.  In  addition,  the  control  {12)  may  reduce  the  sensing/communication 
loads  because  on  one  hand  nonlinear  gain  aij  only  relies  on  states  Xj{ty)  at  the  time  instants 
whenever  the  communication  topology  changes,  and  on  the  other  hand,  information  exchange  of 
sgn{xj(t)  —  xi{t))  may  also  significantly  reduce  the  required  transmission  capacity  compared  with 
that  of  {xj(t)  —  xi(t)).  O 

Under  eooperative  eontrol  ([T^,  the  elosed-loop  multiagent  system  beeome  system  with  diseontin- 
uous  dynamies.  The  sequential  eompleteness  of  sensing/eommunieation  network  may  no  longer 
ensure  the  eonsensus  if  the  gains  an  are  simply  designed  using  (11).  This  is  illustrated  through  the 
following  example. 
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Example  1  Suppose  we  have  3  agents.  Define  index  set  =  {1,  2,  3},  12max  =  G  12  :  xfit)  = 
=  maXjXj(t)},  and  12min  =  {2  G  12  :  xfit)  =  Xm\u(t)  =  mirij  a;j(2)}. 

Assume  that  at  time  instant  we  have  12min(2o)  =  {1}.  12max(^o)  =  {2,3},  and  the  sens¬ 

ing/communication  topologies  among  three  agents  switch  according  to  sensing/communication 
matrices  Sifisk),  Sifisk+i)  cind  S{tzk+2)  defined  below. 


S{hk)  = 

'10  0' 
1  1  0 

,  ^ (fsk+l) 

■  1  0  1  ■ 
0  1  0 

,  S(t3k+2)  — 

1 

0 

0 

0  0 

0  0  1 

0  0  1 

0  1  1 

where  fc  =  0, 1,  •  •  ■ .  It  can  be  readily  verified  that  the  matrix  sequence  Sit^k),  Sit^k+i),  S(t3k+2) 
is  sequentially  complete.  However,  the  consensus  is  not  guaranteed  if  the  standard  gain  design 
for  ttij  in  (0  is  applied  under  control  (|72]).  One  possible  scenario  is  that  according  to  the  sens¬ 
ing/communication  matrix  agent  2  receives  information  from  agent  1  and  may  converge  to 

agent  1  infinite  time  interval  ti  —  to,  thus  at  time  instant  ti,  we  could  have  12min(2i)  =  {1,2} 
and  12max(^i)  =  {3};  similarly,  according  to  Sifi),  agent  1  receives  information  from  agent 
3  and  may  converge  to  agent  3  infinite  time  t2  —  ti,  thus  we  may  have  12min(^2)  =  {2}  and 
12max(^2)  =  {1,3};  by  S(t2),  agent  3  receives  information  from  agent  2  and  may  converge  to 
agent  2  infinite  time  —  t2,  and  we  may  have  12min(^3)  =  {2,3}  and  12max(^3)  =  {!}•  This 
pattern  will  repeat  following  the  periodical  sensing/communication  matrix  sequence  {S'(2j)}.  In 
other  words,  though  within  time  interval  [tofis),  the  communication  topology  is  complete,  con¬ 
traction  mapping  is  not  established  since  we  have  Xraaxits)  =  a;max(^o)  ^nd  x^amits)  =  a;min(^o) 
from  the  above  analysis.  This  is  further  illustrated  in  figure  in  which  we  consider  three  agents 
with  controls  (12)  and  gain  aifif)  are  chosen  based  on  (11),  simulation  parameters  are  given  as 
t^k+i  —  Hk+i-i  =  0.1,2  =  1,  2,  =  0, 1,  ■  ■  ■ ,  and  initial  conditions  Xiitf)  =  0.01, 0:2(20)  =  —0.01, 

and  x^ifo)  =  0.1.  Apparently,  no  consensus  is  reached.  O 


4.2.2  Design  of  New  Discontinuous  Cooperative  Controls 


As  shown  in  examples  standard  network  topology  based  eontrol  gain  design  for  (12)  no  longer 
implies  the  eonsensus  of  multiagent  systems,  even  with  the  most-restrietive  network  conneetivity 
condition  (that  is,  fixed  and  undirected  communication).  In  this  subsection,  in  order  to  ensure  the 
multiagent  systems  consensus  with  control  ([T^  under  the  least  restrictive  sensing/communication 
condition  (that  is,  sequential  completeness  of  {5'(2|,}),  we  propose  a  new  nonlinear  piecewise  gain 
design.  The  convergence  of  the  overall  closed-loop  systems  is  proved  by  developing  a  contraction 
mapping  method  for  multiagent  systems. 


Theorem  1  Consider  the  multiagent  system  (|^  under  cooperative  control  ([72]).  Assume  that  sens¬ 
ing/communication  matrix  sequence  {*S'(2|,)}  is  uniformly  sequentially  complet^  Let  the  nonlin¬ 
ear  gain  aij  be  designed  as  follows:  for  any  agent  I, 

*The  time-varying  sensing/communication  topology  is  considered  here.  If  the  topology  becomes  fixed  after  certain 
time,  we  can  treat  it  as  a  special  case  of  switching  sensing/communication  sequence  S{tl)  with  ct  being  any  positive 
constant. 
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Figure  1 :  System  responses 


1)  if  xi{tl)  =  ma.Xj^_^f^Xj{tl)  =  minj Xj{tl),  then  o.ij{t\)  can  be  any  bounded  positive 
value. 

2)  ifxi{t\)  >  maxjg^v';  let  aijifl)  be  selected  to  satisfy  the  inequality 


3)  ifxiitl)  <  min^g^v';  ^jifk)’  selected  to  satisfy  the  inequality 

maXjgAT,  Xjitl)  -  xiifl) 


0  < 

j&J^i 


< 


Ct 


(14) 


4)  //’minjg^v;  Xj(tl)  <  Xifl)  <  maXjg_V;  let  aijifl)  be  selected  to  satisfy  the  inequality 


0  <  <  min 

jeMi 


maXjgAf;  Xj{tl)-xi(tl) 
Ct 

Ct 


(15) 


Then  consensus  of  system  (|^  is  asymptotically  achieved. 
Proof.  See  [l^. 
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Theorem  [^provides  a  general  set  of  sufficient  gain  design  conditions  for  the  asymptotical  stability 
of  discontinuous  multiagent  systems  with  directed  and  switching  sensing/communication  topolo¬ 
gies.  The  nonlinear  gain  design  conditions  (fTS])  to  ( 15 1  are  imposed  for  the  purpose  of  avoiding 
the  possible  states  oscillation  due  to  the  finite  time  state  reachability  of  dynamical  systems  driven 
by  discontinuous  functions  under  certain  communication  topologies. 

The  result  in  theorem  can  be  extended  to  the  case  with  communication  delays.  That  is,  In  the 
presence  of  sensing/communication  delays,  the  cooperative  control  in  ([T^  becomes 


Mt)  =  Xjitk  -  rij))sgn{xj{t  -  nj)  -  e  [tl,  (16) 

i=i 

where  tij  E  [0,  r]  are  time  delays  incurred  during  transmission  with  r  being  the  upper  bound  on 
latencies  of  information  transmission  over  the  network.  The  following  theorem  is  in  the  sequel. 


Theorem  2  l[26\l  Consider  the  multiagent  system  (|^  under  cooperative  control  ([7^.  Assume  that 
sensing/communication  matrix  sequence  of  {S{tl)}  is  sequentially  complete.  Let  the  nonlinear 
gain  aij  be  designed  as  follows:  for  any  agent  I, 


1)  ifxftl)  =  maxjg^v'i  ~  '^ij)  —  Xj{tl  —  tij),  then  Oiij{t\)  can  be  any  bounded 

positive  values. 


2)  ifxi{t\)  >  maxjg_V;  Xjifl.  —  rif,  let  be  selected  to  satisfy  the  inequality 


0  <  < 


xiifl) 


miUjeM  Xjitl  -  Pj) 
max{ci,  r} 


(17) 


3)  ifxiitl)  <  miujg^v'i  —  pj),  let  oiij{t\)  be  selected  to  satisfy  the  inequality 


0  <  aijitl)  < 
j&M 


maxjgAT;  Xjjtl  -  Tij)  -  xiifl) 
max{ci,  r} 


(18) 


4)  ifmivij^^Xjitl 
inequality 


Tij)  <  xiifl)  <  maXjg_V;  Xjifl  —  Tij),  let  aijlfl)  be  selected  to  satisfy  the 


0  <  aij{tl 


<  mm 


maxjgjv;  Xj{tl-Tij)-xi{tl) 
max{ct,r}  ’ 

max{ct,r} 


(19) 


Then  consensus  of  system  (|^  is  asymptotically  achieved. 
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Figure  2:  System  responses 


4.2.3  Simulation  Results  and  Experimental  Testing 


Let  us  reeonsider  example  for  the  eonsensus  of  three  agents  with  eontrol  (12)  under  the  sens- 
ing/eommunieation  topologies  S{tsk),  S{tsk+i)  and  S{tsk+2)-  We  ehoose  the  nonlinear  pieeewise 
eonstant  gain  aij  based  on  theorem[^  Under  the  same  simulation  eonditions,  system  responses  are 
shown  in  figure  and  eonsensus  is  reaehed. 


To  further  illustrate  the  benefit  of  introdueing  diseontinuous  dynamies  in  multiagent  systems,  the 
underwater  sonar  data  transmission  testing  was  eondueted  at  the  Wave  Laboratory,  Embry-Riddle 
Aeronautieal  University.  As  shown  in  figure  a  wave  tank  is  used  as  the  testbed  and  wave  maker 
generates  the  noise  environment.  Two  sonar  eommunioation  transdueers  are  used  for  sending  and 
reeeiving  data.  100  sets  of  data  are  used  in  the  testing  under  two  seenarios  of  direet  position  data 
transmission  and  indireet  position  data  transmission.  That  is,  in  the  seenario  of  indireet  position 
data  transmission,  the  sign  of  data  (binary  number)  was  transmitted.  Figures  and  show  the 
testing  resulting  for  two  seenarios,  respeetively.  The  transmitting  and  reeeiving  speed  is  S5/R5  = 
13bits/s.  It  is  apparent  that  the  error  rate  with  indireet  data  transmission  is  mueh  lower. 


5.0  RESULTS  AND  DISCUSSION:  OPTIMAL  AND  ADAPTIVE  MULTI¬ 
AGENT  COORDINATION 

In  this  seetion,  we  present  three  results  on  optimal  and  adaptive  multiagent  eoordination.  The  first 
result  is  on  the  design  of  a  praetieally  implementable  approximately  adaptive  neural  eooperative 
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Figure  3:  Wave  Tank 
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Figure  4:  Direct  Position  Data  Transmission 
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Figure  5:  Indirect  Position  Data  Transmission 


control  for  multiagent  systems  based  on  online  approximate  dynamic  programming.  The  second 
one  is  on  the  design  of  optimal  cooperative  control  based  approximate  Q-functions.  The  third  one 
is  on  a  new  distributed  adaptive  cooperative  control  method  for  consensus  tracking  of  multiagent 
systems  with  model  uncertainties. 

5.1  Value  Function  Based  Multiagent  Policy  Iteration 

In  the  study  of  cooperative  control  of  multiagent  systems,  fruitful  results  for  cooperative  control 
design  have  been  obtained  for  first-order  linear  systems  [|71  [T^  |3  El  1201,  for  second-order  linear 
systems  [|23l,  for  high-order  linear  systems  [|T6l  [281,  and  for  nonlinear  systems  [HIl  [TOl  [I2l  [TH 
1111^123,  few  results  are  available  for  optimal  cooperative  control  design.  There  appeared  some 
recent  work  in  the  study  of  optimal  cooperative  control,  such  as  those  in  [l22l[n[3l[T5l.  Nonetheless, 
it  is  still  a  challenge  to  systematically  address  the  optimal  cooperative  control  problem  for  more 
general  nonlinear  multiagent  systems,  particularly,  in  the  presence  of  model  uncertainties. 

The  result  reported  in  this  section  aims  to  deal  with  such  a  challenge.  For  multiagent  optimal  co¬ 
operative  control,  the  key  issue  is  how  to  establish  an  optimality  equation  and  find  its  solution  in 
real  time.  We  tackle  this  problem  by  considering  a  general  class  of  feedback  linearizable  nonlinear 
multiagent  systems.  We  assume  that  there  exist  admissible  cooperative  controls  for  such  kind  of 
multiagent  systems  under  the  complete  sensing/communication  condition  IfT^.  The  fixed  sens¬ 
ing/communication  topology  is  imposed  for  ease  of  design.  The  case  for  more  complicated  time- 
varying  sensing  communication  topology  will  be  treated  in  future  work.  The  optimal  cooperative 
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control  problem  is  then  formulated  as  making  all  systems  aehieve  eonsensus  while  minimizing  the 
individual  sensing/eommunieation  topology  dependent  eost  funetions.  It  is  shown  that  the  optimal 
solution  to  the  defined  problem  requires  to  solve  a  multiagent  Hamilton-Jaeobi-Bellman  (HJB) 
equation.  To  avoid  the  obstaeles  in  analytieally  solving  multiagent  HJB  equation,  we  extend  the 
online  poliey  iteration  approaeh  in  [lUll  [l25l  to  the  multiagent  ease,  and  employ  radial  basis  fune- 
tion  (RBF)  neural  networks  to  approximate  value  funetions  at  eaeh  iteration.  Through  seeking  the 
least-squares  solution  to  estimate  the  optimal  neural  weights,  a  new  approximately  adaptive  multi¬ 
agent  poliey  iteration  algorithm  is  proposed.  It  is  further  shown  that  the  proposed  adaptive  optimal 
eooperative  eontrol  approximately  solves  the  posed  optimal  eonsensus  problem.  Simulation  results 
are  provided  to  illustrate  the  effeetiveness  of  the  proposed  optimal  design. 


5.1.1  Problem  Formulation 


Consider  a  multiagent  system  whieh  has  N  members  and  eaeh  agent  assumes  the  general  nonlinear 
dynamies 

Xi  =  fi{xi)  +  gi{xi)ui,  (20) 


where  i  e  Q  =  {I,  ■■■,  N},  Xi(t)  e  3?"^  is  the  system  state,  Ui  G  SfJ'"  is  the  eontrol  input  to  be 
designed,  /*,  :  3?"^  >->■  3?"^  are  loeally  Lipsehitz  eontinuous  funetions. 


The  objeetive  is  to  design  an  optimal  eooperative  eontrol  Ui{t)  to  aehieve  the  eonsensus  of  the 

(21) 


multiagent  system  (20)  sueh  that 
while  minimizing  the  following  individual  eost  funetion  for  eaeh  agent  i. 


lim  Xi{t)  =  X*,  Vi, 

t^OO 


Xj)  +  uj RiUi  j  dt, 


(22) 


where  x*  is  some  eonstant  denoting  the  eonsensus  value,  Qij  and  Ri  are  symmetrie  and  positive 
definite  matriees,  and  (defined  in  equation  (|^)  is  a  binary  number  deseribing  the  availability  of 
the  sensing/eommunieation  information  exehange  between  the  agent  i  and  the  agent  j. 


It  should  be  noted  that  the  individual  eost  funetion  defined  in  (|22|)  is  related  to  the  measurement  of 
eloseness  of  Xi  to  Xj  and  eontrol  effort  of  agent  i.  In  eontrast,  a  multiagent  differential  graphieal 
game  problem  was  defined  in  [l24l.  in  whieh  a  eooperative  performanee  index  was  imposed  by 
ineluding  the  terms  related  to  eontrol  efforts  of  neighboring  agents.  Here  we  are  not  looking  into 
the  team  performanee  index  as  formulated  in  eooperative  game  based  solution.  Instead,  we  deal 
with  the  optimal  solution  by  looking  into  minimizing  individual  performanee  defined  in  (|22l). 


5.1.2  On  the  design  of  Approximately  Adaptive  Cooperative  Optimal  Control 
Multiagent  HJB  Equation:  The  design  starts  with  the  development  of  multiagent  HJB  equations. 
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Recall  that  the  cost  function  for  agent  i  is  defined  in  (22 ),  which  may  be  rewritten  as 


Xiito) ,  SijXj  (to)) 


X 


■:)  +  U, 


I  RiUi 


dt, 


(23) 


where  A/)  =  {j  G  7^  0}  denotes  the  neighbor  set  of  agent  i.  The  following  lemma  is 

instrumental  in  developing  the  multiagent  Hamilton-Jacobi-Bellman  (HJB)  equation. 


Lemma  1  For  admissible  cooperative  control  Ui{t),  if  there  exists  a  positive  definite  continuously 
differentiable  function  Vfxi,  SijXp  ufj  satisfying  the  following  property 

oyT  QyT 

+  g{xi)ui)  +  ^  -^{f{xj)+g{xj)uj) 

+  X]  “  XifQij{xj  -  Xi)  +  ujRiUi  =  0  (24) 

j&Mi 


and  the  boundary  condition  Vfixfoo) ,  SijXj{oo);  Ui 
tionfor  system  (20)  for  all  t,  and 


0,  then  Vfixi,  SijXjiUi)  is  the  value  func- 


Vii^Xiitof  SijXjitfjj  ufj  Xiitoj ,  SijXjitffjj 


(25) 


It  follows  from  lemma  and  Bellman’s  principle  of  optimality,  we  know  that  the  optimal  value 
function  V*{xi{t),  SijXj{t))  approximately  satisfies  for  small  A  — )■  0 

V*{xi{t),SijXj{t))  ~  mm[l{xi{t),Xj{t),Ui)A  +  Vfixfit  +  A),  SijXj{t  +  A)],  (26) 

Ui 


where  fixffj^xfitj^uf)  =  “  XjYQifixi  -  xf  +  ujRiUi,  xft  +  A)  ~  xft)  +  {f{xi)  + 

g{xi)ui)A,  and  Xjft  +  A)  ~  Xj(t)  +  {f{xj)  +  g{xj)uj)A. 

The  corresponding  optimal  cooperative  control  can  be  derived  as 


u. 


1 

2' 


dVf 

dxi 


(27) 


and  the  mutliagent  HJB  equation  is 


J6M 


4  dxi 


dxi 


f)V*^ 

+  +  9{xj)uj), 


j&Fi 


(28) 
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with  the  associated  boundary  condition  V*{x*,  Sijx*)  =  0,  which  requires  that  the  optimal  value 
must  be  null  when  evaluated  on  an  extremal  trajectory  (all  agents  in  the  set  {i,  A/i}  reach  consen¬ 
sus.) 


The  solution  to  (28 1  would  provide  the  optimal  cooperative  control  in  (27 ).  However,  it  is  difficult 


to  solve  mainly  for  two  reasons.  First,  equation  ([28])  is  a  nonlinear  partial  differential  equation, 
and  it  is  in  general  impossible  to  solve  this  equation  in  analytic  form.  Second,  the  coupling  terms 


cause  extra  difficulty  due  to  involvement  of  Uj  which  may  require 
information  propagation  from  agents  not  in  the  neighboring  set  A/). 


The  Proposed  Multiagent  Policy  Iteration  Algorithm:  The  proposed  multiagent  policy  iteration 
algorithm  consists  of  the  following  two  steps: 


Step  1:  Policy  evaluation.  Find  an  admissible  cooperative  control  policy  Uifi{xi,  SijXj).  For  any 
integer  I  >  0  denoting  the  iteration  index,  solve  for  Vi^i{xi,  SijXj;  u^)  using 


with  Vi^i{x* ,  SijX*)  =  0. 

Step  2:  Policy  improvement.  Update  the  control  policy  by 

_  _  1  p_i 


dxi 


(30) 


The  convergence  of  the  multiagent  policy  iteration  algorithm  given  in  ( 29 )  and  ( 30 1  is  summarized 
into  the  following  theorem. 


Theorem  3  If  a  sequence  of  pairs  {Vi^i,  u^+i}  is  generated  by  (29)  and  (30),  then  the  correspond¬ 
ing  value  function  Vi^i  satisfying 

(31) 


and 


lim  =  V* 
1^00 


(32) 


Proof:  See  OTlI. 

For  the  ease  of  implementation,  the  policy  evaluation  step  in  the  proposed  multiagent  policy 
iteration  algorithm  can  be  replaced  by  following  equation  (33)  for  solving  for  Vi^i  based  on  the 
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available  information  Xi{t),Xj{t)  and  during  the  time  interval  [t,t  +  T]. 


Vi^i{xi{t),SijXj{t)-,Ui^i{t))  =  /  (  i^j  -  Xi)'^Qij{xj  -  Xj)  +  uJiRjUi^i  )  dt 

VieM  ’  J 

+Vi^i{xi{t  +  T),  SijXjit  +  T)]u^{t  +  T)),  (33) 

where  T  >  0  represents  certain  time  interval. 

Approximately  Adaptive  Cooperative  Optimal  Control:  A  significant  advantage  of  the  pro¬ 
posed  multiagent  policy  iteration  algorithm  is  that  it  iteratively  generates  a  sequence  of  pairs 
through  (33)  and  (30)  by  only  using  the  available  local  information  Xi,Xj  and  Ui  for 
agent  i,  which  monotonically  converges  to  the  optimal  value  V*  and  u*.  It  is  apparent  that  the  key 
is  to  solve  for  Vi^i  from  (33 ).  For  the  unknown  value  functions  Vi^i{xi,  SijXj),  we  use  the  following 
neural  network  approximator. 


Vi^i{xi,  SijXj)  =  +  Uijixi),  Vxi  e  n* 


(34) 


where  Xi  =  [sjiXi,  Si2X2,  ■  ■  ■  ,  SijXj,  •  •  •  ,  SiNX^]'^,  6*i  e  is  an  unknown  constant  pa¬ 

rameter  vector,  the  neural  network  node  number  U  >  1,  Uij{xi)  is  the  approximation  error,  and 
^i,iixi)  =  0*2,0  •  •  •  ,  the  known  basis  function  vector. 


Upon  using  the  function  approximator  (|34|),  the  policy  evaluation  equation  in  (|33])  becomes 

rt+T 

'^{Xj  -  Xi)'^Qij{xj  -  Xi)  -f  uJiRiUij  I  dt  =  [^ij{xi{t))  -  ^i,i{xi{t  +  T))]^^*^ 


+Ulij{t), 


(35) 


where  ujij{t)  =  ujij{t)  -  ujij{t  +  T). 


It  follows  from  (35)  that  9*i  provides  the  best  approximate  solution  for  the  policy  evaluation. 
However,  its  value  is  unknown,  and  needs  to  identified  online.  Let  Oij(t)  be  the  estimate  of  9*i, 
and  equation  becomes 


rt+T  / 


ij{xj  -  Xi)  -f  uJjRiUij  dt 


=  [<I>i,z(a;i(f))  -  ^ij{xi{t  +  T))]'^9ij{t)  -f  eij{t), 


(36) 


where  e*,/(f)  =  [$i,i(x*(f))  -  <hi,«(x*(f  -f  T))f9ij{t)  +  u}ij{t),  9ij{t)  =  9*i  -  9ij{t).  Thus,  given 
any  admissible  cooperative  control,  the  parameter  9ij  should  be  chosen  to  minimize  the  squared 
approximation  residual  error  e^i{t).  As  9ij{t)  9* i,  it  is  obvious  that  eij{t)  — >■  cuij. 

In  what  follows,  we  present  the  proposed  adaptive  law  for  9ij  using  the  least-squares  estimation. 
Let  us  define 


Zi{tk)  = 


Xi)^Qij{xj  -  Xi)  -f  uJiRiUij  dt 
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and 


Substituting  this  into  (36)  yields 


(37) 


The  model  in  (37)  is  the  regression  model  for  poliey  iteration  and  T'*,;  is  called  the  regressor. 
Through  the  observation  interval  [tk,  pairs  of  observations  and  regressors  {{zi{tk+iM), 

/i  =  0,l,n  —  1}  are  obtained  by  using  control  policy  Ui^i.  The  parameter  9i^i  will  be  chosen  to  min¬ 
imize  the  least-squares  loss  function 

n—1 


L{e^,tk)  =  {zi{tk+^,)  -  '^i,l{tk+^lVOi^ly 


(38) 


(1=1 

To  this  end,  standard  least-squares  estimation  algorithm  renders 

where  =  [2:^(4), 2;i(4+i),  ^i(4+„_i)]'^,  and  =  ['^li{tk)r  ■  ■  ■  Thus, 

dxi 

(39) 


d'V-  u^- 

according  to  policy  improvement  step  in  (|30|),  and  noting  the  control  law  is 


1 

Ui,i+i  =  --R,  g, 


The  above  results  can  be  summarized  into  the  following  proposition. 

Proposition  1  Under  assumption  of  complete  sensing/communication  topology,  the  control  law 
(|39|)  with  adaptive  law  (|3^  approximately  solves  the  optimal  cooperative  consensus  problem  for 
multiagent  nonlinear  system 


by  minimizing  the  cost  function  ( 22 ). 

In  summary,  the  proposed  approximately  adaptive  multiagent  policy  iteration  (MPI)  algorithm  is 
given  in  Algorithm  1 . 


Algorithm  1  Approximately  Adaptive  MPI  Algorithm 

1:  Let  I  =  0.  Given  initial  states  Xi{to),  SijXj{to),  let  the  initial  admissible  cooperative  control 
policy  be  Uifl. 

2;  Employ  the  control  policy  Ui^i,  and  within  the  observation  interval  [f/xn,  f(z+i)n-i],  collect  the 
data  pairs 

{{.^i  illxn+(i}  ^  ^*,0  (f/xn+zi)  ) )  d  0,1, 77.  1} 


3 

4 

5 


Solve  for  9i^i  from  ([38). 

Solve  for  from  (^. 

Let  I  I  +  1,  and  go  to  step  2,  until 


where  e  >  0  is  a  sufficiently  small  predefined  threshold. 
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5.1.3  Simulation  Results 


To  illustrate  the  proposed  approximately  adaptive  eooperative  eontrol,  we  eonsider  a  simple  mul¬ 
tiagent  system  with  3  agents  modeled  by  the  following  single  integrators 


Xi  =  Ui,  i  =  1,  2,  3 


(40) 


where  x*  e  3?,  and  Ui  G  3?.  Let  the  sensing/communication  topology  among  3  agents  be  given  by 


S  = 


1  1  0 
0  1  1 
1  0  1 


Apparently,  S  matrix  is  eomplete,  and  admissible  eooperative  eontrol  exists  for  the  eonsensus  of 
d^.  Seleet  the  weight  matriees  in  as  Qij  =  l,Ri  =  0.25  for  simulation  purpose.  We  use  a 
single  neural  node  approximator  for  eaeh  value  funetion  1^ Based  on  S  matrix,  we  ehoose  the 
basic  functions  as  =  (xi  —  X2)^,  ^2,1  =  {^2  —  and  $3,/  =  (xs  —  xi)^  for  value  functions 
Vi,h  ^2,1  and  Vs^i,  respectively.  System  initial  states  are  xi(0)  =  0.5,  X2(0)  =  0.2  and  X3(0)  =  0.3. 
Figure  shows  that  system  states  consensus  is  achieved,  figure  displays  the  instantaneous  cost 
values.  Figure  illustrates  the  optimal  cooperative  control  inputs,  and  the  convergence  of  neural 
network  weights  estimates  is  shown  in  figure]^ 


Figure  6:  Consensus  of  Xi{t) 
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Figure  7 :  Instantaneous  eost  values  versus  time 


Figure  8:  Optimal  eooperative  eontrols 
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Figure  9:  Parameters  of  neural  networks  versus  time 


Simulation  was  also  done  for  the  following  three  agents  with  nonlinear  models,  and  the  eorre- 
sponding  results  in  figures  [T^to  [^illustrate  the  performance  of  the  proposed  optimal  coordination 
algorithm. 


Agent  1:  Xi  =  —Xi  +  sin(a;i)  +  Ui 
Agent  2:  X2  =  -X2  -  sin(a;2)  +  U2 
Agent  3:  is  =  Ms 


5.2  Approximate  Q-Function  for  Multiagent  Coordination 

There  are  two  main  issues  with  the  proposed  value  function  based  multiagent  policy  iteration 
algorithm.  First,  the  computation  of  requires  to  know  system  dynamics  gi.  Second,  the 
implementation  of  estimation  algorithm  requires  an  excitation  condition  for  matrix  which 

might  cause  difficult  in  selecting  basis  functions  for  linear  approximators.  To  address  those  issues, 
we  propose  the  following  new  cooperative  Q-iteration  algorithm. 
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Figure  10:  System  responses 


Figure  1 1 :  Cooperative  control  inputs 
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Figure  12:  Performance  Index 


Figure  13:  Parameters  esimation 
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5.2.1  Problem  Statement 


Consider  multiagent  systems  with  more  general  nonaffine  discrete-time  model 

Xi{k  -h  1)  =  fi{xi{k),Ui{k)) 

The  individual  cost  function  for  agent  i  is 

oo  /  N 
l=k  \j=l 

Q-function  is  a  state-action  value  function,  which  gives  the  value  obtained  when  starting  from  a 
given  state,  applying  a  given  action,  and  following  a  policy  thereafter.  The  Q  function  for  agent  i 
is  given  as  follows: 

N 

Qi{xi{k),  SijXj{k),Ui{k))  =  '^{xi{k)  -  Xj{k))'^ SijQij{xi{k)  -  Xj{k))  +  uj {k)RiUi{k) 

i=i 

+Vi{xi{k  -f  1),  SijXj{k  -f  l),Ui{k  +  1)) 

Correspondingly,  the  optimal  Q  function  id  defined  as 

N 

Q*{xi{k),SijXj{k),Ui{k))  =  '^{xi{k)  -  Xj{k)Y SijQij{xi{k)  -  Xj{k))  +  uj {k)RiUi{k) 

i=i 

+V*{xi{k  +  l),SijXj{k  +  l),Ui{k  +  1)) 

and  the  optimal  cooperative  coordination  control  protocol  is 

u*{xi{k),SijXj{k))  =  argmm^^Q*{xi{k),SijXj{k),Ui{k)) 

5.2.2  Proposed  Coordination  Algorithm 

To  this  end,  the  proposed  Q  function  based  approximate  multiagent  policy  iteration  algorithm  is 
summarized  as  two  steps. 

•  Policy  evaluation 

-  According  to  Bellman  equation 

N 

Ql^i^i{xi{k),SijXj{k),u^{k))  =  '^{xi{k)  -  Xj{k))'^ SijQij{xi{k)  -  Xj{k)) 

j=i 

+uli{k)RiUi^i{k)  +  Qi,i{fi,  Sijfj,  Ui^i{k)) 
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-  The  parameter  vector  is  obtained  by  a  projection  mapping  P 


=  P{Q 


+ 

^,/+l 


) 


•  Policy  improvement 


Ui^i+i  =  argmin„^(5i,i+i 


Specifically,  by  using  available  on-line  data  Xi{k),Xj{k),Ui{k),Xi{k  +  l),ri{k  +  l),Ui{k  +  1),  the 
following  temporal  difference  Q-Iteration  can  be  used  in  the  multiagent  Q-function  policy  iteration 
algorithm 

Qi,k+i{xi{k),Ui{k))  =  Qi,k{xi{k),Ui{k))  +  ak[ri{k  +  1)  +  lQi,k{xi{k  +  l),Ui{k  +  1)) 

-Qi,k{^i{k),Ui{k))]  (41) 

However,  for  systems  with  large  and  continuous  spaces,  Q-functions  in  terms  of  state-action  pairs 
will  have  infinite  number  of  pairs,  there  is  no  way  to  learn  and  explore  all  those.  Therefore  we 
propose  to  use  the  parametric  approximation  of  Q-functions,  that  is,  let 

{Xi^  SijXj  ^  XljPOi 

where  4>i  basis  functions,  6i  parameter  vector  to  be  estimated.  To  this  end,  the  model-free  approxi¬ 
mate  multiagent  Q-learning  algorithms  can  be  summarized  as  follows. 

1:  Measure  initial  states  Xi{tQ),  SijXj{tQ). 

2:  Initialize  parameter  vector,  6*j(0)  =  0. 

3:  Let  Mi(0)  =  0 

4:  For  every  time  set  /c  =  0, 1,  2,  •  •  ■  ,  do 

5:  Apply  Ui{k),  measure  Xi{k  +  l),Xj{k  +  1),  rj(/c  -f  1) 

6: 

Ui{k  +  1)  =  ^9i{k) 

UXli 

7: 


6i{k  +  l)  =  6i{k)  +  ak[ri{k  +  l)+^(t)J{xi{k  +  l),Ui{k  +  l))6i{k) 

-(pf  {xi{k),Ui{k))9i{k)]  (pi{xi{k),Ui{k)) 

8:  end  for 

5.2.3  Simulation  Results 

To  illustrate  the  proposed  design,  we  apply  the  approximate  Q-learning  algorithm  to  the  consensus 
control  of  the  following  three  agents 

Xi{k  -f  1)  =  Xi{k)  +  Tui{k),T  =  0.05 


25 


with  the  Sensing/Communication  Topology 


S  = 


1  1  0 
0  1  1 
1  0  1 


The  reward  funetion  ri{k  +  1)  are 

ri{k  +  1)  =  —5{x2{k)  —  Xi{k)Y  —  0.01ui{kY 

r2{k  +  1)  =  -5{x3{k)  -  X2{k)f  -  0.01u2{k)‘^ 

r3{k  +  1)  =  -5{xi{k)  -  X3{k)y  -  0.01u3(/c)^ 

and  the  approximate  Q-funetions 


Ql(xi,  Ml)  =  -{{X2  -  Xif  +  (X2  -  Xi)mi  +  ul)9i 


Qi{x2,  U2)  =  -((xs  -  X2f  +  (xs  -  a;2)M2  +  uI)92 
Ql{xz,  U3)  =  -{{xi  -  +  (xi  -  X3)U3  +  uI)93 

The  eorresponding  eooperative  eontrols  are 


with  adaptation  laws 


ui{k) 

U2{k) 

U3{k) 


1 

2RD  +  29i{k) 
1 

2'Rd  +  292{k) 
1 

2Rd  +  293{k) 


{xi{k)  -  X2{k))9i{k) 

{x2{k)  -X3{k))92{k) 
{x3{k)  -xi{k))93{k) 


9i{k  +  l)  =  9i{k)  +  ak[ri{k  +  l) +^(l)^{xi{k  +  l),Ui{k  +  l))9i{k) 

-(j)J {xi{k),Ui{k))9i{k)]  (j)i{xi{k),Ui{k)) 


Simulation  parameters:  T  =  0.05;  Qd  =  10;  Rd  =  5;  7  =  0.98;  a  =  0.5.  The  simulation 
results  are  shown  in  figures  [T4|-[T7j 


5.3  Adaptive  Consensus  Tracking  for  Uncertain  Multiagent  Systems 

5.3.1  Problem  Statement 

Consider  the  multiagent  systems  in  which  the  ith  agent  is  described  by  the  scalar  differential  equa¬ 
tion 

Xi  =  ttiXi  +  giUi  (42) 
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Figure  14:  System  responses 


Figure  15:  Cooperative  control  inputs 
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Figure  16:  Performance  Index 


Figure  17:  Parameters  estimation 
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where  and  Qi  are  constants,  G  3fJ  state,  and  Ui  &  R  control  input.  We  first  consider  the  case  of 
tti  being  unknown  and  gi  is  known.  For  simplicity,  we  simply  assume  that  Qi  =  1.  Let  the  reference 
trajectory  be  described  by  the  first-order  differential  equation 

xq  =  aoxo  +  roit)  (43) 

where  unknown  constant  oq  <  0,  ro(t)  is  a  piecewise-continuous  bounded  function  of  time  pa¬ 
rameterized  by  ro(t)  =  where  basis  functions  0(f)  =  [0i(f),  02(f),  •  ■  ■  ,  ^ 

available  to  all  agents,  and  parameters  w  =  [wi,  ^2,  •  ■  ■  ,  wiY  ^  are  unknown  constants.  The 
control  objective  is  to  design  adaptive  control  Ui  such  that 

lim  Xiit)  =  a;o(f). 

t^OO 


5.3.2  Proposed  Adaptive  Control 

To  proceed,  let  us  define  the  Laplacian  matrix  L  for  the  sensing/communication  S'(f)  as  follows. 

L  =  diag  j  -  S{t) 

We  also  assume  that  the  reference  state  Xo{t)  is  available  to  at  least  one  agent  through  sens¬ 
ing/communication  detection,  and  this  is  described  by  a  diagonal  matrix  B  given  below 

B  =  diag  {6*0} 

where  bio  >  0  means  that  agent  i  has  the  information  a;o(f).  Let  dj  be  the  parameter  estimate  of 
a*  =  ao  —  tti  for  agent  i,  Wij  be  the  estimate  of  Wj  by  agent  i,  Wi  =  [wu,  •  •  •  ,  wuY-  The  control 
input  for  agent  i  is  chosen  to  be 

Ui  =  diXi  4-  0^(f)wi  (44) 

Defining  the  tracking  error  X  =  [xi,  •  •  ■  ,  XnY  =  [xi  —  xq,  ■  ■  ■  ,  Xn  —  xqY^  the  parameter  errors 
di  =  di  —  a* ,  Wi  =  Wi  —  Wi  =  [wii,  •  •  •  ,  wuY,  we  obtain  the  error  equation  as 

Xi  =  aoXi  -f  diXi  +  (Y'wi  (45) 

and  the  overall  error  dynamics  as 

i 

X  =  aoX  +  Xd  +  ^iW^i  (46) 

i=l 

where  d  =  [di,  •  ■  ■  ,  d^]^,  w^i  =  [wu,  ■  ■  ■  ,  WniY,  =  diag[a;i,  •  ■  ■  ,  Xn],  and  $1  =  diag[0i(f),  •  ■  ■  , 

0i(f),--  -  ,0i(f)]. 

We  further  define  consensus  error  e*  =  YljaMi  ~  +  ^*0(2^0  —  Xi).  It  thus  can  be  verified 

that 

{L  +  B)X  =  [ci,  •  •  •  ,  Cj,  •  •  •  ,  CnY  (47) 
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Let  us  consider  the  Lyapunov-like  function 


1  ~  1  .  ^ 

V  =  -X^(L+B)X+-^r„ 


I  n 


2  +  2 

2  =  1  j  =  l  2  =  1 


(48) 


where  Lq.  >  0,  L^.  >  0.  The  time  derivative  of  V  along  the  trajectories  of  (46)  is  given  by 


I  n 


V  =  X^{L  +  B)X  +  Y,  +  Y.Y1  r-. 

2  =  1  j  =  l  2=1 


ij  WijWij 


(49) 


Choosing  the  adaptive  laws 


di  XixCi 

(50) 

^ij  = 

(51) 

for  z  =  1,  •  • 

■  ,  n,  j  =  1,  ■  ■  ■  ,  and  then  we  obtain 

V  =  aoX^(L  +  B)X  <  0 

(52) 

which  implies  X,  Oj,  Wij  G  £oo-  Also  X  G  £2  and  X  G  C^o,  which  further  implies  that  X  — )■  0  as 
f  — )■  00.  The  main  result  is  summarized  into  the  following  theorem. 


Theorem  4  Consider  the  multiagent  system  in  (42).  If  the  sensing/communication  topology  S{t) 
is  connected,  and  B  has  at  least  one  entry  being  nonzero,  then  the  distributed  adaptive  cooperative 
control  in  ([??])  together  with  the  adaptive  laws  in  ( SO )  and  ([37])  guarantee  the  boundedness  of  all 
signals  of  the  closed-loop  system  and  achieve  asymptotical  consensus  tracking. 


5.3.3  Simulation  Results 

The  proposed  adaptive  cooperative  control  is  simulated  for  the  following  multiagent  system  with 
three  agents 

Xi  =  2xi  +  Ml 


±2=  X2+  U2 


X3  =  5X3  +  U3 

Assume  the  Informed  Agent:  xq  =  —XQ+r{f),r{t)  =  2  cos(f)+3  cos(2f).  The  sensing/communication 
matrix  S  and  leader  information  matrix  B  are  given  by 


- 1 

1 

1 

0 

■  1 

0 

1 - 

0 

1 

1 

1 

,  B  = 

0 

0 

0 

0 

_ 1 

1 

1 

0 

0 

- 1 

0 

30 


The  corresponding  adaptive  consensus  control  is 


Ui  =  QiXi  +  Wii  COs{t)  +  Wi2  cos(2t) 


withe  adaptive  laws 

dj  =  kxiC^  Sij{xj  -  Xi)  +  hiQ{xo  -  Xi)) 
j 

Wii  =  k  cos(t)(^  Sij{xj  -  Xi)  +  bio{xo  -  Xi)) 

3 

Wi2  =  kcos{2t)C^Sij{xj  -  Xi)  +  bio{xo  -  Xi)) 

3 

Simulation  results  are  given  in  figures  [TSp^  which  illustrate  the  effectiveness  of  the  proposed 
design. 
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Figure  18:  The  trajectory  of  informed  agent 


6.0  RESULTS  AND  DISCUSSION:  MULTIAGENT  COORDINATION  AP¬ 
PLICATIONS 

In  this  section,  we  present  two  case  studies  for  multiagent  coordination  tasks.  One  is  for  formation 
control  of  mobile  robots,  and  the  other  is  for  coverage  control  of  mobile  agents. 
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Figure  19:  The  trajectories  of  agents 


Figure  20:  Adaptive  cooperative  controls 
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Figure  21:  Parameter  estimate  for  wi 


Figure  22:  Parameter  estimate  for  W2 
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Figure  23:  Parameter  estimate  for  a 


6.1  Multiagent  Formation  Control 

6.1.1  Problem  Statement 

Consider  a  network  of  multiple  nonholonomie  mobile  robots  with  the  individual  system  dynamies 
given  by 

Xi  =  Vi  cos  9i, 

iji  =  Uisin6'i,  (53) 

Si  oji 

where  i  G  =  {1,  •  •  ■  ,n},  {xi,yi)  G  3?^  denotes  the  ith  robot’s  position,  6*j  is  the  orientation, 
Vi  G  3?  driving  veloeity,  and  cji  G  3fJ  the  steering  veloeity. 

The  design  objeetive  of  this  paper  is  to  eoordinate  the  motion  of  individual  robots  to  follow  a 
desired  trajeetory  eontour  while  maintaining  eertain  preseribed  geometrie  formation  shape  through 
loeal  information  exehange  among  robots.  By  taking  the  whole  group  of  mobile  robots  as  a  virtual 
body  moving  along  the  desired  trajeetory,  formation  shape  of  robots  in  the  group  ean  be  determined 
by  a  set  of  loeal  eoordinates  with  referenee  to  the  moving  frame  attaehed  to  the  desired  trajeetory. 

More  speeifieally,  let  go(f)  =  [2:o(f),  ?/o(f)]^  ^  3?^  be  the  desired  trajeetory  for  the  group  motion, 
the  moving  frame  attaehed  to  q'o(f)  can  be  defined  by  the  following  orthonormal  veetors  ei{t) 
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and  62  (t) 


61  (t) 


eii(t) 


e2(t) 


621 (t) 

622 (^) 


■  ^o(^) _  ■ 

+  [yo{t)? 

yo{t) 

.  v^^ioWFTl^oW  - 

■ _ ^o(^)  ■ 

Vi^oitW  +  [yoit)? 

xojt) _ 

-  +  [1/0 (t)] 2  . 


Accordingly,  any  formation  consisting  of  n  robot  positions  in  )  can  be  expressed  as  {Pi ,  ■  •  ■  ,  Pn } 
with 

Pi{t)  =  aiiei{t)  +  ai2e2{t),  (54) 

where  aij  are  constants  of  determining  the  formation.  To  this  end,  the  formation  control  objective 
can  be  recast  as  to  design  the  control  laws  Vi{t)  and  uJi{t)  for  the  ith  robot  such  that 


lim 

t^OO 


Xi(t) 

yiit) 


-  qoit)  -  Piit) 


0. 


(55) 


6.1.2  Proposed  Linearization-Based  Control 

To  facilitate  the  control  design,  the  robot  model  ([TT])  is  first  converted  into  a  linear  model  as 

Zii  =  Mil,  Zi2  =  Ui2  (56) 


where  zn  =  Xj  +  Pcos  6**,  Zi2  =  yi  +  Psin  9i,  and 


Vi 

cos  9i 

sin  9i 

Uil 

Ui  _ 

— 

sin  6i 

COS  9i 

.  _ 

L  R 

R  J 

To  this  end,  the  proposal  new  coordination  control  is  of  the  form 

Uii  =  ^  aij{t)sgn  I  Zji  -  Zii  +  ^{au  -  aji)ei  |  +  go  +  pf 

j&^fi  V  1=1 

Ui2  =  ^  aij(f)sgn  I  Zj2  -  Zi2  +  ^{au  -  aji)ei  |  +  go  +  pf 

isM  \  1=1 


(57) 

(58) 


in  which  the  control  gains  are  chosen  based  on  the  following  guidelines:  for  any  agent  /, 

1)  if  zi{tl)  =  maXjg_Vi  Zj{tl)  =  minjg_V;  Zj{tl),  then  aij{tl)  can  be  any  bounded  positive  value. 


35 


2)  if  zi{tl)  >  maxjg_Vi  be  selected  to  satisfy  the  inequality 

ziitl)  -  minjeAT,  Zj{tl) 


0  < 

j&J^i 


< 


Ct 


3)  if  zi{tl)  <  minjg_v,  let  be  selected  to  satisfy  the  inequality 

maXjeM  Zj{tl)  -  Ziitl) 


0  < 


< 


Ct 


4)  if  minjg_V;  Zj{tl)  <  Zi{tl)  <  maXjg_v,  Zj{tl),  let  aij{tl)  be  selected  to  satisfy  the  inequality 

maxjgAf;  Zj{tl)-zi{tl) 


0  <  aijitl)  <  min  | 


Ct 


6.1.3  Simulation  and  Experimental  Results 


We  conducted  the  experimental  study  to  verify  the  proposed  design.  In  the  experiment,  we  use 


4  Amigobot  robots  (see  figure  24)  and  the  sensing/communication  topology  among  robots  is  as¬ 
sumed  to  be 

110  0 
0  110 
0  0  11 
10  0  1 


5  = 


The  testing  results  validated  the  proposed  design. 


Figures  25  and  26  show  the  snapshots  of  experiments  conducted  using  AmigoBots  at  Robotics 
Lab,  Bethune-Cookman  University. 

Figure]^ shows  the  formation  shape  changing  from  rectangle  to  line,  then  to  rhombus,  and  finally 
converging  to  one  point. 


6.1.4  Proposed  Nonlinear  Model-Based  Control  with  Limited  Information  of  Desired  Tra¬ 
jectory 


It  is  noted  that  the  control  objective  defined  in  (55 )  can  be  achieved  through  the  standard  tracking 
control  design  for  individual  robots  if  the  desired  trajectory  go(f)  and  its  derivative  qo^t)  are  avail¬ 
able  to  every  robot.  However,  such  a  design  may  not  be  robust  in  the  presence  of  disturbance  and 
noise  measurements  due  to  the  lack  of  coordination  among  robots.  On  the  other  hand,  the  desired 
trajectory  qo{t)  may  be  known  only  by  some  of  robots  in  the  group.  Therefore,  it  is  desirable  to  de¬ 
sign  distributed  formation  control  law  for  the  Ah  robot  based  on  information  exchange  and  relative 
position  measurement  between  robots  within  its  sensing/communication  range. 
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P3-AT 


AmigoBot 


Figure  24:  P3-AT  and  AmigoBots 


Square  to  Line 


Line  to  Square 


Figure  25:  Reetangle-to-line  and  line-to-reetangle  formation  eontrol  with  undireeted  eommuniea- 
tion 


Format  from  Rectangle  to  line  to  Rhombus  End  in  a  Point 


Figure  26:  Formation  changes  from  rectangle  to  line  to  rhombus  and  ends  with  a  point  with  undi¬ 
rected  communication 


Distributed  Observers  for  Desired  Trajectory 

The  proposed  new  formation  control  is  done  with  the  aid  of  distributed  observes  for  the  estimation 
of  qo{t).  The  proposed  distributed  observer  is  of  the  form  (for 

Xi,o{t)  =  aijSgn{xjfl{tl)  -  Xi^o{tl))  +  aioSioSgn{xo{tl)  -  Xi^o{tl))  (59) 

yi,o{t)  =  aijSgVL{yjfl{tl)  -  Viflitl))  +  aioSioSgn(|/o(ffc)  -  l/i,o(^D)  (60) 

j&Mi 

Oifiit)  =  ^  aijSgn{ejfi{tl)  -  +  aioSioSgn(6'o(tfc)  -  Oi^^itl))  (61) 

jeA/; 

where  Xifl{t),  and  are  the  ith  robot’s  estimate  of  xo(f),  yo{t),  and  6*o(f),  respectively, 
Sjo  =  1  if  and  only  if  the  ith  robot  has  the  direct  access  to  the  information  of  the  desired  trajectory, 
ajj  and  Ojo  are  piecewise  constant  control  gains  to  be  designed,  and  sgn(-)  function  is  defined  as 

(  1,  z>0 

sgn(2;)  =  <j  0,  z  =  0 

(  -1,  z<0 

Nonlinear  Formation  Control  with  Limited  Information  of  a  Desired  Trajectory 

The  proposed  new  design  is  summarized  into  the  following  proposition. 
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Theorem  5  Consider  a  group  of  nonholonomic  mobile  robots.  Let  the  distributed  cooperative 
control  be  for  t  G  [to  +  kTg,  to  +  {k  +  l)Ts) 


Unit)  =  +  a^2  sinci;(t  —  to  ~  kT^) 

Ui2{t)  =  iJf  +  b%  cosa;(t  -  to  -  kTg) 


where  a;  =  i?,  0^2  7^  0  can  be  any  constant,  and 


'^Gij{k)[xj{k)  -  xfk)  -  x'ji^k)  +  xf{k  +  1)], 


j=i 


i=i 


h’^  - 

"i2  — 


2uj 

,k 


i=i 

0‘iiZi2{k)Ts  + 


It  k  2 

5;  G.j(k)[p,(k)  -  p.(k)  -  pf(k)  +  pf(k  +  1)]  - 
“ahiiT, 


U 


with 


xfik) 

yfik) 


^ijik)  . _ 

sr^n  /  7  \  )  J  1)  ‘  ‘ 

Xifi{k)  +  an  cos  9ifl{k)  -  0*2  sin  9ifi{k) 
Vigik)  +  an  sin  9ifl{k)  +  Q!i2  cos  9ifi{k) 


(62) 

(63) 

(64) 

(65) 

(66) 

(67) 


xf(fc  +  l)  =  Xifi{k  +  1)  +  an  cos  9ifl{k  +  1)  -  ai2  sin  9ifi{k  +  1) 
yf{k  +  l)  =  Pi fi{k  +  1)  +  an  sin  9ifi{k  +  1)  +  ai2  cos  9i^o{k  +  1) 


where  Xi^,  yi^,  and  9ifl  are  governed  by  (59),  (60),  and  (61 ),  respectively,  and 

Xifi{k  +  1)  =  Xifi{k)  +  Ts{'^  aijSgn{xjfl{k)  -  Xifl{k))  +  aioSioSgn{xo{k)  -  Xi^o(k))X68) 


j&Mi 


yi,o{k  +  1)  =  yi,o{k)  +  Ts{^  aijSgn{yjfl{k)  -  yip{k))  +  aioSioSgn{yo{k)  -  yip{k)))  (69) 


j&Mi 


hi^oik  +  1)  =  9ifl{k)  +  TsC^  aijSgn{9jfl{k)  -  9ifl{k))  +  aioSiosgn{9o{k)  -  9i^o{k)))  (70) 

j&M 


Then  the  formation  control  objective  (55)  is  achieved. 
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6.1.5  Simulation  Results 


We  simulate  the  proposed  formation  eontrol  by  eonsidering  three  mobile  robots  moving  aeeording 
to  a  eircular  contour  while  maintaining  a  right  triangle  formation. 


Let  q'o(^)  be  [sin(0.2f),  —  cos(0.2f)]^.  The  corresponding  moving  frame  is  given  by  ei(f)  = 
[cos(0.2f),  —  sin(0.2f)]'^,  e2(f)  =  [sin(0.2f),  cos(0.2f)]^.  The  formation  parameters  are  given 
by  ttii  =  0,ai2  =  0,021  =  — 1,022  =  1,031  =  —1,032  =  —1.  The  initial  conditions 
[xi{to),yi{to),9i{to)]  are  given  by  [0.1,  0.2,  vr^,  [1, -2,  tt/O],  [-1, -1.5,  0]^  for  i  =  1,2,3,  re¬ 
spectively,  4  =  0.2  and  Ts  =  0.1.  Figure  illustrates  the  phase  portrait  under  the  proposed 
controls  proposed  controls  ([6^  and  ([6^. 


Figure  27 :  Phase  portrait  of  three  robots 


6.2  Multiagent  Coverage  Control 

Coverage  control  aims  to  address  the  issue  of  deployment  of  sensor  networks  for  tasks  like  mon¬ 
itoring  an  environment,  environment  modeling,  search  and  rescue,  and  so  on.  In  recent  years, 
mobile  autonomous  agents  have  been  applied  in  the  construction  of  mobile  sensor  networks  due  to 
their  flexibility  and  resilience  to  dynamically  changing  environments. 

In  this  section,  we  propose  a  distributed  deployment  algorithm  for  a  group  of  mobile  robots 
to  cover  a  convex  region.  The  individual  mobile  robot  considered  has  kinematic  constraints. 
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and  may  only  exchange  information  locally  with  its  neighboring  counterparts  due  to  its  lim¬ 
ited  sensing/communication  range.  The  proposed  deployment  algorithm  iteratively  updates  the 
Voronoi  partition  through  local  information  exchange,  and  then  moves  toward  its  centroid  based 
on  centroid-driven  control  algorithms. 

6.2.1  Problem  Statement 

To  solve  the  autonomous  deployment  problem,  we  make  the  following  assumptions  without  loss 
of  generality: 

•  The  robots  have  the  knowledge  of  the  area  to  be  covered  and  sensed. 

•  The  robots  have  limited  sensing  ranges  r^,  and  limited  communication  ranges  Tc.  That  is, 
only  points  in  a  circle  centered  at  the  current  robot’s  position  and  of  radius  can  be  sensed 
by  the  robot.  In  addition,  at  time  t,  robot  i  can  communicate  with  its  neighboring  robot  j, 
j  G  Afiit)  =  {j\dij  <  Tc},  where  dij  is  the  distance  between  the  ith  robot  and  the  jth  robot. 

•  For  a  given  region,  there  are  enough  number  of  n  mobile  robotic  agents  to  completely  cover 
the  area. 

To  this  end,  the  multiagent  coverage  control  problem  is  formulated  as  designing  a  distributed  de¬ 
ployment  control  algorithm  to  move  the  robots  towards  the  centroid  of  the  corresponding  parti¬ 
tioned  regions  based  on  the  minimization  of  certain  coverage  cost  functions.  Under  the  aforemen¬ 
tioned  assumptions,  the  coverage  control  problem  has  at  least  one  solution.  In  this  section,  a  new 
paradigm  is  proposed  to  deploy  the  robots  by  assuming  limited  sensing  and  communication  ranges. 

6.2.2  Proposed  Coverage  Control 

The  kinematic  model  of  mobile  robotic  agent  carrying  sensors  is  described  by  the  following  equa¬ 
tions: 

Xi  =  Vi  cos  9i, 

iji  =  ViSixiOi,  (71) 

Oi  oji 

where  i  G  U  =  (1,  •  ■  ■  ,  n},  Pi  =  [xi,  yi]'^  G  3?^  denotes  the  ith  robot’s  position,  9i  is  the  orien¬ 
tation,  Vi  e  ^  driving  velocity,  and  G  the  steering  velocity.  The  optimal  coverage  control 
problem  is  then  defined  as  designing  distributed  cooperative  control  Vi  and  Ui  such  that  agents 
converge  to  optimal  positions  p*  by  minimizing  certain  cost  function. 

The  proposed  deployment  algorithm  is  a  recursive  one.  At  each  sampling  time  instant,  each  robot 
first  computes  its  Voronoi  cell  based  on  its  communication  with  neighboring  robots,  then  deter¬ 
mine  the  centroid  of  its  Voronoi  region,  and  then  moves  towards  it  by  employing  a  distributed 
coordination  algorithm. 
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In  what  follows,  we  describe  the  basic  idea  of  Voronoi  partition  based  coverage  control  for  mobile 
robots.  Let  us  denote  an  arbitrary  point  in  the  region  Q  as  q.  At  each  sampling  time  instant,  the 
agents  will  be  able  to  generate  the  Voronoi  partition  of  Q.  That  is,  for  agent  i  at  position  pi,  its 
Voronoi  region  satisfies 


Vi  =  {qe  Q\\\q-Pi\\  <  \\q-Pj\\yj  ^  i} 

Define  cost  function  over  the  region  as 

,Pn)  =  ^  7:\\q  -  Pif(l){q)dq 


(72) 


(73) 


where  (j){q)  is  a  weighting  function  of  importance  over  Q.  The  distance  function  ^\\q  —  is 
included  in  the  cost  function  for  the  consideration  of  reducing  energy  consumed  by  a  sensor’s 
transceiver  because  it  is  generally  a  function  of  distance.  In  addition,  the  reliability  of  the  informa¬ 
tion  at  q  measured  by  robot  at  pi  will  degrade  with  the  increase  of  distance  ||7  — 

At  each  sampling  time  instant,  the  generation  of  Voronoi  region  Vi  for  robot  i  is  based  on  the  robots 
in  its  neighboring  set  A/).  That  is,  robot  i  can  only  use  the  position  information  of  the  robots  in  its 
communication  range  Tc  to  compute  V).  This  is  a  realistic  situation  since  during  the  motion,  the 
robot  could  move  in  or  out  the  communication  range  which  is  limited. 

Once  the  Voronoi  region  is  obtained,  a  simple  control  to  drive  the  robot  to  the  centroid  of  the 
Voronoi  region  is  to  follow  negative  gradient  of  cost  function  J,  that  is. 


dp, 


^  -  {q  -  Pi)(t>{q)dq 


'Vi 


However,  as  discussed  before,  the  kinematic  model  in  fTT])  is  nonlinear  and  may  not  be  able  to 
follow  the  negative  gradient  due  to  velocity  constraints.  A  simple  way  to  avoid  this  issue  is  to 
conduct  input/output  linearization  by  choosing  a  reference  point  off  the  robot  center  (x*,  yi),  that 
is,  let  the  cartesian  coordinates  of  the  off-center  reference  point  be 


Pii  =  Xi  +  b  cos  9i 
Pi2  =  Vi +  b  sin  9i 


(74) 

(75) 


where  6  >  0  is  a  constant.  Differentiating  (fM])  and  (75 1  with  respect  to  time,  we  have 
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(76) 


To  this  end,  the  distributed  deployment  control  for  robot  i  is  given  by 

dJ 


A 

Xli 


Ui2 


=  -^  =  -  J  (q  -  Pi) (l>iq)dq  =  -MvXCvi  -  Pi) 


(77) 
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where  mass  Mv^  is  given  by 


Mv,  =  [  (t){q)dq 


(78) 


the  first  moment 


and  the  eentroid 


(79) 

(80) 


Onee  Ui  is  obtained,  the  eontrol  inputs  Vi  and  oji  ean  be  ealeulated  by  using  inverse  input  transfor¬ 
mation  given  below: 

Vi 
Ui 


=  T-\e) 


Vil 

Ui2 


6.2.3  Simulation  Results 


In  this  seetion,  we  simulate  the  proposed  distributed  deployment  algorithm.  Consider  first  the  ease 
with  5  mobile  robotie  agents,  and  we  assume  fully  eonneeted  eommunieation  topology.  That  is,  at 
eaeh  time  instant,  eaeh  robot  has  the  position  information  of  the  rest  members  in  the  group.  Figure 

partition,  respeetively.  Figure  [^illustrates  of  the  evolution  of  the  robots. 


28  and  ^  illustrate  the  initial  loeation  with  Voronoi  partition  and  the  final  position  with  Voronoi 


In  the  2nd  ease,  we  eonsider  10  robots  with  limited  eommunieation  ranges.  Assume  that  the  initial 
eommunieation  topology  is  defined  by 
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and  ehanges  subsequently  based  on  system  evolution.  Figure  [3T]  and  \32\  illustrate  the  initial  loea¬ 
tion  with  Voronoi  partition  and  the  final  position  with  Voronoi  partition,  respeetively.  Figure  [^ 
illustrates  of  the  evolution  of  the  robots. 
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Figure  28:  Initial  location  and  Voronoi  partition 


Figure  29:  Final  location  and  Voronoi  partition 
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Figure  30:  Evolution  of  the  robots 
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Figure  3 1 :  Initial  loeation  and  Voronoi  partition 
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Figure  32:  Final  location  and  Voronoi  partition 
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Figure  33:  Evolution  of  the  robots 
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7.0  CONCLUSIONS 


In  this  report,  we  have  presented  a  comprehensive  description  of  the  research  results  obtained 
through  this  project.  Specifically,  technique  details  have  been  provided  for  three  sets  of  multia¬ 
gent  coordination  algorithms  which  solve  the  distributed  task  coordination  problem  while  ensuring 
system  stability,  accommodating  the  least-restrictive  sensing/communication  conditions,  handling 
system  uncertainties,  and  guaranteeing  near-optimal  performance.  Computer  simulation  and  ex¬ 
perimental  results  are  presented  to  illustrate  the  effectiveness  of  the  proposed  algorithms. 
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A  Distributed  Deployment  Algorithm  for  Mobile  Robotic  Agents  with 
Limited  Sensing/Communication  Ranges 

Jing  Wang,  Christopher  Smith,  Gennady  Staskevich,  and  Brian  Ahhe 


Abstract — In  this  paper,  we  propose  a  distributed  deploy¬ 
ment  algorithm  for  a  group  of  mobile  robots  to  cover  a 
convex  region.  The  Individual  mobile  robot  considered  has 
kinematic  constraints,  and  may  only  exchange  information 
locally  with  its  neighboring  counterparts  due  to  its  limited 
sensing/communication  range.  The  proposed  deployment  algo¬ 
rithm  iteratively  updates  the  Voronoi  partition  through  local 
information  exchange,  and  then  moves  toward  its  centroid  based 
on  centroid-drive  control  algorithms.  Particularly,  in  addition  to 
gradient-based  centroid-drive  control  algorithm  in  which  input- 
output  linearization  has  been  applied  to  robot  model,  a  new 
algorithm  based  on  distributed  consensus  is  proposed  to  directly 
address  the  kinematic  constraint  associated  with  robot  model. 
Simulation  results  are  provided  to  Illustrate  the  effectiveness  of 
the  proposed  algorithm. 

1.  Introduction 

Coverage  control  aims  to  address  the  issue  of  deployment 
of  sensor  networks  for  tasks  like  monitoring  an  environment, 
environment  modeling,  search  and  rescue,  and  so  on  [7]  [2]. 
In  recent  years,  mobile  autonomous  agents  have  been  applied 
in  the  construction  of  mobile  sensor  networks  [3][6][15][8] 
due  to  their  flexibility  and  resilience  to  dynamically  changing 
environments. 

Generically,  in  solving  coverage  control  problem  using 
multiple  mobile  agents,  two  fundamental  issues  have  to  be 
addressed:  i)  how  to  optimally  assign  subregions  for  each 
agent  based  on  sensing/communication  capabilities,  online 
updates  about  importance  of  the  region  to  be  surveyed, 
and  information  exchange  among  agents;  ii)  how  to  design 
control  algorithms  to  drive  agents  to  the  desired  deployment 
locations.  In  addressing  problem  i),  the  typical  method  is  to 
generate  Voronoi  diagram  for  agents  by  minimizing  certain 
cost  functions  related  to  distances  between  the  deployment 
positions  and  the  measuring  points  [3].  The  importance  of 
region  can  be  embedded  into  cost  function  through  density 
function,  which  could  be  analytically  unknown  and  can 
be  adaptively  estimated  using  measurement  data  [15].  For 
problem  ii),  the  standard  gradient-based  control  can  be  used 
if  agents  assume  linear  dynamics. 

In  this  paper,  we  propose  a  distributed  deployment  algo¬ 
rithm  for  a  network  of  mobile  robotic  agents  with  kinematic 
constraints,  which  is  a  common  characteristic  for  typical 
mobile  robots  such  as  differential  drive  robots  and  car-like 
robots.  In  addition,  we  consider  more  realistic  scenarios 
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of  that  robots  have  limited  sensing/communication  ranges, 
and  solving  deployment  problem  through  local  informa¬ 
tion  exchange  among  agents.  The  proposed  work  relies 
on  the  distributed  coordination/consensus  results  for  mul¬ 
tiagent  systems  [11] [10] [12].  Distributed  coordination  of 
multiagent  systems  has  been  an  active  research  topics  in 
recent  years.  Fruitful  results  are  available  for  the  study  of 
cooperative  control  design  and  network  connectivity  analysis 
[4][12][14][1 1].  In  [11],  we  proposed  sequential  complete¬ 
ness  condition  for  sensing/communication  matrix  sequences 
connectivity  for  ensuring  coordination  of  multiple  dynamical 
systems.  On  the  other  hand,  cooperative  control  and  forma¬ 
tion  control  have  been  studied  extensively  for  both  linear 
and  nonlinear  dynamical  systems  [14]  [5].  However,  there  is 
still  a  lot  of  work  to  be  done  for  dealing  with  coordination 
control  of  mobile  robots  with  kinematic  constraints. 

The  proposed  deployment  control  algorithm  in  this  paper 
follows  a  two-step  strategy.  First,  at  each  time  instant, 
Voronoi  partition  for  each  robot  is  generated  based  on  robot’s 
current  position  as  well  as  the  positions  of  robots  in  its  com¬ 
munication  range.  Then  control  algorithms  are  designed  to 
drive  robots  to  centroids  of  Voronoi  partitions.  In  particular,  a 
distributed  centroid-drive  algorithm  is  proposed  by  explicitly 
taking  into  account  kinematic  model  constraints  for  robots. 
Unicycle  robots  are  used,  and  the  distributed  control  is  based 
on  the  transformation  of  unicycle  into  chained  form  [17][16]. 
Under  the  assumption  of  robots  maintaining  a  sequentially 
complete  communication  topology,  the  proposed  distributed 
deployment  control  algorithm  solves  the  posed  coverage 
control  problem.  Simulation  results  are  included  to  illustrate 
the  effectiveness  of  the  proposed  design. 

II.  Problem  Formulation 

In  this  paper,  we  shall  consider  the  problem  of  deploying 
a  fixed  number  of  mobile  robotic  agents  in  a  given  convex 
environment  Q.  An  illustration  example  is  shown  in  figure 
1,  in  which  three  robots  start  from  some  initial  positions, 
and  through  coordination  eventually  move  to  points  [1,3]^, 
[2, 1]^,  and  [3, 3]^,  respectively,  to  cover  a  square  area  Q  = 
4x4  unit^.  Each  robot  has  the  sensing  range  rg  =  2.2  unit. 

To  solve  the  autonomous  deployment  problem,  we  make 
the  following  assumptions  without  loss  of  generality: 

.  The  robots  have  the  knowledge  of  the  area  to  be  covered 
and  sensed. 

.  The  robots  have  limited  sensing  ranges  r^,  and  limited 
communication  ranges  Tc-  That  is,  only  points  in  a  circle 
centered  at  the  current  robot’s  position  and  of  radius 
Tg  can  be  sensed  by  the  robot.  In  addition,  at  time  t. 
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Fig.  1.  Deployment  of  3  robots  in  a  square  area  (3  =  4x4  unit‘d  with 
sensing  range  rs  =  2.2  unit 


robot  i  can  communicate  with  its  neighboring  robot  j, 
j  G  JVi{t)  =  {j\dij  <  Tc},  where  dij  is  the  distance 
between  the  zth  robot  and  the  jth  robot. 

•  For  a  given  region,  there  are  enough  number  of  n  mobile 
robotic  agents  to  completely  cover  the  area. 

To  this  end,  the  multiagent  coverage  control  problem  is 
formulated  as  designing  a  distributed  deployment  control 
algorithm  to  move  the  robots  towards  the  centroid  of  the 
corresponding  partitioned  regions  based  on  the  minimization 
of  certain  coverage  cost  functions.  Under  the  aforementioned 
assumptions,  the  coverage  control  problem  has  at  least  one 
solution.  In  this  paper,  a  new  paradigm  is  proposed  to  deploy 
the  robots  by  assuming  limited  sensing  and  communication 
ranges. 

A.  Robot  Modeling 

The  kinematic  model  of  mobile  robotic  agent  carrying 
sensors  is  described  by  the  following  equations: 

Xi  =  Vi  cos  9i, 

i/i  =  ViSinOi,  (1) 

Oi  =  LUi 

where  j  S  U  =  {I,--  -  ,n},  pi  =  [xi,yi\'^  G  3?^  denotes 

the  ith  robot’s  position,  9i  is  the  orientation,  Vi  G  ^  driving 
velocity,  and  Ui  G  ^  the  steering  velocity.  The  optimal  cover¬ 
age  control  problem  is  then  defined  as  designing  distributed 
cooperative  control  Vi  and  oJi  such  that  agents  converge  to 
optimal  positions  p*  by  minimizing  certain  cost  function. 

Remark  2.1:  The  model  (1)  has  the  so-called  nonholo- 
nomic  constraints  [9].  For  such  a  system,  there  is  no  contin¬ 
uous  state  feedback  control  to  solve  its  stabilization  problem 
due  to  the  violation  of  Brockett’s  necessary  condition  [1]. 
Therefore,  it  becomes  even  more  challenging  to  address  the 
optimal  deployment  problem  of  multiple  mobile  agents  with 
nonholonomic  constrains. 

III.  Proposed  Deployment  Algorithm 

In  this  section  we  present  a  distributed  deployment 
algorithm  for  mobile  robotic  agents  with  limited  sens¬ 
ing/communication  ranges.  The  proposed  deployment  algo¬ 
rithm  is  a  recursive  one.  At  each  sampling  time  instant. 


each  robot  first  computes  its  Voronoi  cell  based  on  its 
communication  with  neighboring  robots,  then  determine  the 
centroid  of  its  Voronoi  region,  and  then  moves  towards  it  by 
employing  a  distributed  coordination  algorithm. 


A.  Voronoi  Partition 

In  solving  coverage  control  problem  for  sensor  networks, 
Voronoi  diagram  has  been  popular  in  generating  the  deploy¬ 
ment  positions  for  sensor  nodes  [3].  In  what  follows,  we 
describe  the  basic  idea  of  Voronoi  partition  based  coverage 
control  for  mobile  robots. 

Let  us  denote  an  arbitrary  point  in  the  region  Q  as  q. 
At  each  sampling  time  instant,  the  agents  will  be  able  to 
generate  the  Voronoi  partition  of  Q.  That  is,  for  agent  i  at 
position  Pi,  its  Voronoi  region  satisfies 

Vi  =  {qG  Q\\\q-Pt\\  <  ||g-Pil|,Vj  ^  i}  (2) 

Define  cost  function  over  the  region  as 

"  /■  1 

J{pi.  ■  ■  ■  ,Pn)  =  Y1J  2^\d  -  P^\\'^4'{Q)dq  (3) 


where  is  a  weighting  function  of  importance  over  Q. 
The  distance  function  \\\q  —  PiW^  is  included  in  the  cost 
function  for  the  consideration  of  reducing  energy  consumed 
by  a  sensor’s  transceiver  because  it  is  generally  a  function 
of  distance.  In  addition,  the  reliability  of  the  information  at 
q  measured  by  robot  at  pi  will  degrade  with  the  increase  of 
distance  ||(jf  —  Pi|p. 

At  each  sampling  time  instant,  the  generation  of  Voronoi 
region  Vi  for  robot  i  is  based  on  the  robots  in  its  neighboring 
set  JVi-  That  is,  robot  i  can  only  use  the  position  information 
of  the  robots  in  its  communication  range  Tc  to  compute  Vi. 
This  is  a  realistic  situation  since  during  the  motion,  the  robot 
could  move  in  or  out  the  communication  range  which  is 
limited.  It  is  apparent  that,  by  only  considering  the  robots 
in  its  communication  range,  the  obtained  Voronoi  partition 
could  be  different.  For  instance,  consider  a  robot  at  location 
[2,1]^  computing  its  Voronoi  region  for  a  square  area  Q  = 
4x4  unit'^ .  Figure  2  and  figure  3  show  the  resulting  Voronoi 
region  under  3  neighboring  robots  and  2  neighboring  robots, 
respectively. 

Once  the  Voronoi  region  is  obtained,  a  simple  control  to 
drive  the  robot  to  the  centroid  of  the  Voronoi  region  is  to 
follow  negative  gradient  of  cost  function  J,  that  is. 


dpi 


{q  -  Pi)(j){q)dq 


However,  as  discussed  before,  the  kinematic  model  in  (1) 
is  nonlinear  and  may  not  be  able  to  follow  the  negative 
gradient  due  to  velocity  constraints.  A  simple  way  to  avoid 
this  issue  is  to  conduct  input/output  linearization  by  choosing 
a  reference  point  off  the  robot  center  {xi,  pi),  that  is,  let  the 
cartesian  coordinates  of  the  off-center  reference  point  be 


Pit  =  Xi  +  bcos9i  (4) 

P%2  =  Vi  +  b  sin  9i  (5) 
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Fig.  2.  Voronoi  region  for  robot  at  [2, 1]^  with  3  neighboring  robots 


Fig.  3.  Voronoi  region  for  robot  at  [2, 1]^  with  2  neighboring  robots 


where  6  >  0  is  a  constant.  Differentiating  (4)  and  (5)  with 
respect  to  time,  we  have 


Pil 

COS0 

— 6sin0 

Vi 

Pi2 

sin0 

hcos0 

A 


T{0) 


Vi 

A 

Uil 

UJi 

. 

(6) 


To  this  end,  the  distributed  deployment  control  for  robot  i  is 
given  by 


A 

= 


Uil 

Ui2 


dpi 


-  [  {q-  Pt)Hq)dq  =  -Mvi  (Cvi  -  Pi)  (7) 

JVi 


where  mass  My.  is  given  by 

^Vi  =  [  (t){q)dq 

JVi 


the  first  moment 


Lvi  =  /  q(l^{q)dq 

JVi 


(8) 

(9) 


and  the  centroid 


Cy.  = 


Lvi 

Mi 


(10) 


Vi 


Once  Ui  is  obtained,  the  control  inputs  Vi  and  oji  can  be 
calculated  by  using  inverse  input  transformation  given  below: 


Vi 

UJ,. 


=  T-\0) 


Uil 

Ui2 


B.  Deployment  Algorithm  Based  on  Distributed  Consensus 

In  this  subsection,  a  distributed  deployment  algorithm  by 
directly  dealing  with  nonlinear  model  in  (1)  is  presented. 
That  is,  we  propose  a  new  control  to  drive  robots  to  Voronoi 
centroids  based  on  distributed  consensus  algorithms. 

To  start,  we  first  convert  (1)  into  the  following  canonical 
chained  form 

Zil  =  Uil 

Zi2  =  Ui2  (11) 

Zi3  =  Zi2Uii 


by  using  the  state  and  input  transformations  defined  below 


Zil  =  Xi, 
Uil  = 


Zi2=tan6»i,  Zi3  =  yi, 

ViCOSOi,  Ui2=  - 

cos^  Oi 


(12) 

(13) 


The  controls  un  and  Ui2  will  be  designed  and  the  corre¬ 
sponding  Vi  and  uji  can  be  obtained  through  the  inverse 
transformation  of  (13). 

To  facilitate  the  design,  we  apply  a  binary  matrix  C (f)  to 
describe  the  time-varying  communication  topologies  among 
robots,  that  is,  given  a  time  sequence  :  rj  =  0, 1,  •  •  •  }, 
C{t)  is  defined  by 


C{t) 


Cll  Ci2{t) 

C2l{t)  C22 


Vln  (f  ) 

^271  (f ) 


Vnlil)  Vji2(l')  ‘  ‘  ‘  Vjin 


(14) 


with  C{t)  =  C{t^)yt  e  where  cu  =  1;  c^j{t)  = 

1  if  the  jth  robot  is  in  the  sensing/communication  range 
of  the  ith  robot  at  time  t,  and  Cij  =  0  if  otherwise;  and 
Iq  =  Iq.  It  can  be  assumed  without  loss  of  any  generality 
that  0  <  c^  <  <  C(  <  00,  where  c^  and  q  are 

constant  bounds. 

Remark  3.1:  At  each  sampling  time  instant,  finite-time 
steering  control  can  be  used  to  move  robots  to  the  centroids 
of  Voronoi  regions.  In  what  follows,  in  order  to  further  im¬ 
prove  the  robustness  against  measurement  errors,  we  propose 
a  distributed  coordination  algorithm  based  on  information 
exchange  among  neighboring  robots. 

The  proposed  design  is  based  on  distributed  consensus 
idea,  which  requires  that  the  communication  topologies 
defined  by  (14)  satisfy  sequential  completeness  condition 
[11][17].  The  sequential  completeness  condition  describe  the 
least  required  condition  on  network  connectivity  for  cooper¬ 
ative  control  design,  which  is  equivalent  to  the  existence  of 
a  spanning  tree  introduced  in  [12]. 

Assumption  3.1:  The  group  of  robots  defined  in  (1)  has  a 
sequentially  complete  communication  network. 
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In  what  follows,  we  present  the  distributed  deployment 
algorithm  under  assumption  3.1.  Define  an  infinite  sequence 
of  time  instants  {fg  +  kT^}  for  k  G  Af  =  {0, 1,  •  •  •  }  and 
with  sampling  time  0  <  Tg  <  c^.  The  control  inputs  will 
be  updated  according  to  the  sampling  time  instants.  For 
notational  convenience,  zit^  +  kTs)  is  simplified  as  z{k)  for 
any  variable  z.  At  each  time  instant  k,  the  Voronoi  partition 
Vi  is  obtained  for  robot  i  based  on  local  information,  then 
its  centroid  [Cv^^xik),Cv-^y{k)]'^  is  computed,  and  then  the 
distributed  centroid  control  algorithm  is  used  for  moving 
robots  to  the  desired  positions.  The  proposed  distributed 
centroid  control  algorithm  is  summarized  as  follows. 

Let  the  distributed  centroid  control  be  for  t  G  [fg  + 
kTg^  Iq  +  {k  +  l)T's) 


Uii{t)  =  ail  +  ai2  sin uj{t  —  to  —  kTs)  (15) 

Ui2(t)  =  bii  +  bi2  cos  -  to  -  kTs)  (16) 

where  w  =  ^,  0*3  ^  0  can  be  any  constant,  and 


=  i^'^Gij{k)[xj(k)  -  Xi{k)  -  Cy.^xik) 


Tg  .  ^ 

~i~Cvi,x{k)] 
1 


^  H  -  Za{k)], 


b^  - 

^i2  — 


2uj 


(17) 

(18) 


~i~^Vi,y{k)]  — 


Y^G.i{k)[yi{k)-y,{k)-Cx,,y{k) 

i=i 

2 


—ciiiZi2{k)Ts  + 


(19) 


with 


G^iik)  = 


Cij  (k) 


EUc^vik) 


,  j  =  1, 


,n. 


(20) 


In  the  use  of  algorithms  (15)  and  (16),  the  centroid 
[Cvi,x{k),Cv^^y{k)]'^  at  each  step  will  be  generated  using 

(loi’ 

In  summary,  the  proposed  distributed  deployment  algo¬ 
rithm  is  given  in  Algorithm  1. 


Algorithm  1  Distributed  Deployment  Algorithm 


1:  Let  k  =  0.  Given  initial  states  Pi{k),  CijPj{k),  calculate 
the  initial  Voronoi  partition  Vi{k). 

2:  Compute  the  centroid  C'ui,y(fc)]^  using  (10). 

3:  Employ  control  (7)  or  (15)-(16). 

4:  Let  fc  •(—  fc  -f  1,  and  go  to  step  2,  until 

iCv^Ak+^)-Cx.Ak)f+iCv,.y{k+l)-Gx^,y{k)f)  <  e, 

where  e  >  0  is  a  sufficiently  small  predefined  threshold. 


IV.  Simulation 

In  this  section,  we  simulate  the  proposed  distributed  de¬ 
ployment  algorithm.  Consider  first  the  case  with  5  mobile 
robotic  agents,  and  we  assume  fully  connected  communica¬ 
tion  topology.  That  is,  at  each  time  instant,  each  robot  has  the 
position  information  of  the  rest  members  in  the  group.  Figure 
4  and  5  illustrate  the  initial  location  with  Voronoi  partition 
and  the  final  position  with  Voronoi  partition,  respectively. 
Figure  6  illustrates  of  the  evolution  of  the  robots. 


Fig.  4.  Initial  location  and  Voronoi  partition 


Fig.  5.  Final  location  and  Voronoi  partition 


In  the  2nd  case,  we  consider  10  robots  with  limited  com¬ 
munication  ranges.  Assume  that  the  initial  communication 
topology  is  defined  by 


C'(O) 


1  1 
0  1 
0  0 
0  0 
0  0 
0  0 
0  0 
0  0 
0  0 
1  0 


0  0 
1  0 
1  1 
0  1 
0  0 
0  0 
0  0 
0  0 
0  0 
0  0 


0  0 
0  0 
0  0 
1  0 
1  1 
0  1 
0  0 
0  0 
0  0 
0  0 


0  0 
0  0 
0  0 
0  0 
0  0 
1  0 
1  1 
0  1 
0  0 
0  0 


0  0 
0  0 
0  0 
0  0 
0  0 
0  0 
0  0 
1  0 
1  1 
0  1 
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Fig.  6.  Evolution  of  the  robots 


Fig.  9.  Evolution  of  the  robots 


and  changes  subsequently  based  on  system  evolution.  Figure 
7  and  8  illustrate  the  initial  location  with  Voronoi  partition 
and  the  final  position  with  Voronoi  partition,  respectively. 
Figure  9  illustrates  of  the  evolution  of  the  robots. 


Fig.  7.  Initial  location  and  Voronoi  pailition 


Fig.  8.  Final  location  and  Voronoi  partition 


V.  Conclusion 

In  this  paper,  we  proposed  a  distributed  deployment  al¬ 
gorithm  for  solving  the  coverage  control  problem  of  mo¬ 
bile  robotic  agents  with  inherent  kinematic  constraints.  The 
proposed  design  assumes  the  limited  sensing/communication 
capabilities,  and  the  generation  of  Voronoi  partition  for  each 
robot  as  well  as  the  centroid-drive  control  are  based  on 
local  information  exchange  among  agents.  Simulation  results 
validated  the  effectiveness  of  the  proposed  design.  Future 
work  will  be  focused  on  experimental  validation  of  the 
proposed  algorithm  by  using  Kilobots  [13]. 
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Discontinuous  Cooperative  Control  for  Consensus  of  Multiagent 
Systems  with  Switching  Topologies  and  Time-Delays 
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Abstract — In  this  paper,  we  propose  a  discontinuous  co¬ 
operative  control  for  consensus  of  multiagent  systems  with 
directed  and  switching  sensing/communication  topologies  and 
time-delays.  By  introducing  a  new  design  for  nonlinear  co¬ 
operative  control  gains,  multiagent  system  consensus  can  be 
guaranteed  in  the  presence  of  switching  topologies  and  time- 
delays.  System  convergence  analysis  is  done  by  employing  a 
new  contraction  mapping  method.  Simulation  examples  are 
provided  to  illustrate  the  effectiveness  of  the  proposed  design. 

1.  Introduction 

Cooperative  control  of  multiagent  systems  has  attracted 
a  great  deal  of  attention  in  recent  years  [18][14][1][19]. 
Multiagent  systems  are  generically  defined  as  a  group  of 
dynamical  systems  in  which  certain  emergent  behaviors  are 
exhibited  through  the  local  interaction  of  group  members  that 
individually  have  the  capability  of  self-operating.  Fundamen¬ 
tally,  the  key  issues  in  engineered  multiagent  systems  are  the 
study  of  network  controllability  and  the  design  of  distributed 
cooperative  control.  In  terms  of  network  controllability,  the 
objective  is  to  figure  out  the  connectivity  conditions  on  sen¬ 
sor/communication  topologies  of  the  network  for  achieving 
consensus  behavior.  In  [8] [20],  the  condition  is  obtained  for 
composite  undirected  graphs  which  need  to  be  connected. 
Extensions  were  made  in  [17]  [9]  to  the  case  with  directed 
graphs,  and  the  less  restrictive  conditions  are  stated  as  that 
there  exists  a  spanning  tree  or  the  network  is  strongly 
connected  periodically.  Complement  to  the  aforementioned 
graph-theoretical  methods,  a  matrix-theoretical  framework 
is  developed  in  [16]  to  deal  with  the  high-order  systems 
with  arbitrary  but  finite  relative  degrees.  It  is  shown  that 
network  controllability  is  ensured  if  and  only  if  the  sens¬ 
ing/communication  network  is  sequentially  complete. 

The  design  of  cooperative  control  is  closely  related 
to  system  dynamics.  For  linear  systems,  the  results  in 
[8]  [17]  [9]  [6]  [20]  are  developed  for  the  first-order  integrator 
model,  in  [22]  for  double  integrator  model,  and  in  [16][23] 
for  high-order  linear  model.  For  nonlinear  systems,  the 
problem  becomes  complicated  since  network  controllability 
may  not  render  the  direct  design  of  cooperative  control  and 
system  dynamics  have  to  be  explicitly  taken  into  account.  In 
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[11],  a  solution  is  obtained  by  convexity  analysis  for  a  class 
of  discrete-time  nonlinear  systems.  The  continuous-time  non¬ 
linear  systems  are  also  addressed  such  as  in  [10][13][15]. 
Particularly,  a  subtangentiality  condition  on  the  vector  fields 
is  identified  in  [10].  In  [13],  the  local  passivity  condition 
is  imposed  on  nonlinear  functions  in  the  system  dynamics, 
while  a  diagonally  quasi-linear  functions  of  positive  gains 
is  introduced  in  [15].  In  addition,  time  delays  are  literally 
analyzed  in  [13][15]  for  continuous-time  nonlinear  systems. 
It  should  be  noted  that  the  results  in  [10][13][15][24]  are 
for  nonlinear  systems  with  smooth  dynamics.  There  also 
appeared  some  pioneering  work  on  consensus  of  systems 
with  discontinuous  dynamics  [5]  [3]  [2] [7]  [4]  by  using  the 
tools  from  nonsmooth  analysis  [21]  [12].  Discontinuous  con¬ 
trol  law  was  proposed  for  the  coordination  of  nonholonomic 
mobile  robots  in  [5].  The  finite-time  semistable  concept  was 
introduced  in  [7]  for  a  class  of  switched  rendezvous  proto¬ 
cols.  The  results  in  [3]  [2]  addressed  the  distributed  estimation 
and  tracking  problem  using  a  variable  structure  approach,  and 
a  binary  consensus  control  protocol  was  designed  in  [4]  via 
a  pin  node. 

In  this  paper,  we  propose  a  new  discontinuous  cooperative 
control  design  for  multiagent  systems  with  switching  and 
directed  sensing/communication  topologies.  The  case  in  the 
presence  of  sensing/communication  delays  is  also  rigorously 
addressed.  Particularly,  we  developed  a  contraction  mapping 
method  for  the  consensus  analysis  of  multiagent  systems 
under  the  proposed  discontinuous  cooperative  control.  The 
proposed  discontinuous  cooperative  control  design  provides 
a  possible  way  to  address  the  cooperative  control  problem 
with  more  complicated  system  dynamics,  and  enriches  the 
disposal  for  cooperative  control  protocols.  The  contributions 
of  the  paper  are  two-fold.  First,  it  reveals  that  network 
controllability  condition  does  not  guarantee  the  consensus  in 
the  presence  of  discontinuous  system  dynamics.  Second,  it  is 
rigorously  proved  that  through  designing  nonlinear  piecewise 
control  gains,  the  convergence  can  be  ensured  for  multiagent 
systems  with  switching  topologies  and  time-delays  under  the 
least-restrictive  network  controllability  condition  of  that  the 
system  sensing/communication  topologies  are  sequentially 
complete.  Simulation  examples  are  provided  to  illustrate  the 
effectiveness  of  the  proposed  design. 

II.  Problem  Formulation 

Consider  a  multiagent  system  which  has  n  members  and 
each  agent  assumes  the  single-integrator  dynamics 

Xi=Ui,  (1) 
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where  i  =  {1,  •  •  •  ,n},  Xi{t)  €  5ft  is  the  state,  €  5ft 
is  the  control  input  to  be  designed.  The  objective  of  this 
paper  is  to  design  a  discontinuous  cooperative  control  Ui{t) 
to  achieve  the  consensus  of  the  multiagent  system  (1),  that 
is, 

lim  Xi(f)  =  x*,  Vi,  (2) 

t—¥00 

where  x*  is  some  constant  denoting  the  consensus  value. 

To  reach  the  consensus,  the  control  design  will  be  based 
on  the  sensing/communication  information  exchange  among 
agents,  which  can  be  described  by  the  following  sens¬ 
ing/communication  matrix  and  its  corresponding  time  se¬ 
quence  {tl  :  k  =  0, 1,  •  •  •}.  That  is,  within  time  interval 
[tltl+i)’  the  sensing/communication  topology  is  assumed 
to  be  unchanged. 


S{tl)  = 

Sii 

S2l(f|) 

sMti)  ■ 
S22 

••  Siqitl)  ■ 
••  S2q{tl) 

.  Sqlitl) 

Sq2{tl)  • 

^qq 

S{t)  =  S{tl),  (3) 

where  su  =  1;  Sij{t)  =  1  if  the  ith  agent  can  receive  the 
information  from  the  jth  agent  at  time  t,  and  Sij  =  0  if 
otherwise;  and  fg  =  to-  The  neighbor  set  of  agent  i  is  defined 
as  Afi  =  {j  G  n|sij  ^  0}.  We  further  assume  without  loss 
of  any  generality  that  0  <  —  t^  <Ct  <  oo,  where 

C(  and  Ct  are  constant  bounds. 

The  proposed  cooperative  control  is  of  the  form 

n 

=  '^0!u{sii{tl.),xi{tl))sgn{xi{t)  -  Xi{t)),  (4-) 
1^1 

where  is  a  nonlinear  gain  to  be  designed  based  on 

the  sensing/communication  topology  as  well  as  the 

available  boundary  values  xi{tl)  if  su{tl)  ^  0,  and  sgn(-) 
function  is  defined  as 

[  1,  z>0 

sgn(z)  =  <^  0,  z  =  0 

[  -1,  z<0 

III.  Main  Results 

Assume  that  the  multiagent  system  (1)  is  operating  under 
switching  and  directed  sensing/communication  topologies. 
That  is,  sensing/communication  matrix  is  changing, 

and  not  necessary  be  symmetric  (in  general  Sy(f|)  ^ 

To  proceed  with  the  design  and  stability  analysis  for  the 
closed-loop  system  under  control  (4),  we  introduce  the  fol¬ 
lowing  definitions  which  are  adapted  from  [16]  and  describe 
the  standing  conditions  on  sensing  and  communication. 

Definition  3.1:  Sensing/communication  matrix  sequence 
{S'(f)}  is  said  to  be  sequentially  lower-triangularly  complete 
if  it  is  sequentially  lower-triangular  and  in  every  row  i  of 
its  lower  triangular  canonical  form,  there  is  at  least  one 
j  <  i  such  that  the  corresponding  block  is  uniformly  non¬ 
vanishing. 


Definition  3.2:  Sensing/communication  matrix  sequence 
{S'(f)}  is  said  to  be  sequentially  complete  if  the  sequence 
contains  an  infinite  subsequence  that  is  sequentially  lower- 
triangularly  complete. 

Remark  3.1:  The  sequential  completeness  concept  of  the 
sensing/communication  matrix  sequence  {5'(f)}  was  first 
introduced  in  [16].  It  spells  out  the  least  restrictive  connec¬ 
tivity  condition  for  sensor/communication  network  in  order 
to  achieve  the  asymptotically  cooperative  stability  of  the 
overall  system.  It  is  equivalent  to  condition  of  the  existence 
of  a  spanning  tree  in  the  graph  theory  [14].  As  an  example, 
consider  the  following  communication  sequence. 


■  1 

0 

0  ■ 

■  1 

0 

1  ■ 

Sit3k)  = 

1 

1 

0 

,S{t3k+l)  — 

0 

1 

0 

0 

0 

1 

r  1  0  0  ■ 

0 

1 

0 

1 

S{t3k+2)  = 


0  1  0 
0  1  1 


(5) 


where  k  =  0, 1,---.  It  is  readily  verified  that  the  ma¬ 
trix  sequence  {^'(fgfe),  S'(f3fe+i),  >S'(f3fe+2)}  is  sequentially 
complete  since  the  sub-sequence  consisting  of  S{t3k)  and 
S{t3k+2)  is  sequentially  lower  triangular  complete.  O 


A.  Motivating  Example 

For  the  cooperative  control  of  multiagent  systems  (1),  if 
the  standard  design  of  Ui{t)  is  adopted  as  given  below 

n 

Mt)  ='^auisuitl)){xi{t)  -  xfit)),tG  (6) 


where 


(7) 


then  it  has  been  proved  in  [16]  that  the  sequential  com¬ 
pleteness  of  sensing/communication  matrix  sequence  {S'(f)} 
is  the  necessary  and  sufficient  condition  for  consensus  of 
multiagent  systems.  However,  under  the  discontinuous  co¬ 
operative  control  (4)  proposed  in  this  paper,  the  sequential 
completeness  of  sensing/communication  network  may  no 
longer  ensure  the  consensus  if  the  gains  an  are  simply 
designed  using  (7).  This  is  illustrated  through  the  following 
example. 

Example  1:  Suppose  we  have  3  agents.  Define  index  set 
H  =  {1,2,3},  V  H  :  xfifi)  =  ~ 

TLna-x.j  Xj(t)},  and  flmin  =  {f  €  H  :  Xi{t)  =  Xniin{t)  = 
min^  Xj{t)}. 

Assume  that  at  time  instant  to,  we  have  Hmin(fo)  =  {1}, 
and  fljnaxito)  =  {2j3}>  ^nd  the  sensing/communication 
topologies  among  three  agents  switch  according  to  sens¬ 
ing/communication  matrices  S{t3k),  S{t3k+i)  and  S'(f3fc+2) 
defined  in  (5). 

It  can  be  readily  verified  that  the  matrix  sequence 
S (tok) ,  S ,  S {t3k+2)  is  sequentially  complete.  How¬ 
ever,  the  consensus  is  not  guaranteed  if  the  standard  gain 
design  for  in  (7)  is  applied  under  control  (4).  One  possi¬ 
ble  scenario  is  that  according  to  the  sensing/communication 
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matrix  S{to),  agent  2  receives  information  from  agent  1  and 
may  converge  to  agent  1  in  finite  time  interval  ti  —  to, 
thus  at  time  instant  ti,  we  could  have  nmin(fr)  =  {li2} 
and  flmax(fr)  =  {3};  similarly,  according  to  S{ti),  agent 
1  receives  information  from  agent  3  and  may  converge  to 
agent  3  in  finite  time  t2  —  ti,  thus  we  may  have  nniin(i2)  = 
{2}  and  =  {1)3};  by  S{t2),  agent  3  receives 

information  from  agent  2  and  may  converge  to  agent  2  in 
finite  time  to  —  <2,  and  we  may  have  nmin(f3)  =  {2,3} 
and  Umax  (is)  =  {!}■  This  pattern  will  repeat  following  the 
periodical  sensing/communication  matrix  sequence  {S'(fi)}. 
In  other  words,  though  within  time  interval  [^0,^3),  the 
communication  topology  is  complete,  contraction  mapping 
is  not  established  since  we  have  a;niax(i^3)  =  a^max(fo)  and 
a;min(f3)  =  a;niin(fo)  from  the  above  analysis.  This  is  further 
illustrated  in  figure  1,  in  which  we  consider  three  agents 
with  controls  (4)  and  gain  aijit)  are  chosen  based  on  (7), 
simulation  parameters  are  given  as  tok+i—tok+i-i  =  0.1,  j  = 
1,  2,  fc  =  0, 1,  •  •  •,  and  initial  conditions  xi{to)  =  0,  X2{to)  — 
0,  and  X3{to)  =  0.1.  Apparently,  no  consensus  is  reached. 
O 


Fig.  1.  System  responses 


In  the  presence  of  sensing/communication  delays,  the 
cooperative  control  in  (4)  becomes 

n 

Ui{t)  =  '^aii{sii,xi{tl  -  Tii))sgn{xi{t  -  Til)  -  Xi{t)), 
1=1 

^GK)^fc+r))  (3) 

where  tu  G  [0,  r]  are  time  delays  incurred  during  transmis¬ 
sion  with  r  being  the  upper  bound  on  latencies  of  information 
transmission  over  the  network.  In  general,  multiagent  sys¬ 
tems  with  time-delays  become  more  involved.  By  imposing 
more  stringent  network  connectivity  conditions,  such  as  bi¬ 
directional  (undirected)  sensing/communication,  the  consen¬ 
sus  may  still  be  ensured.  However,  given  the  discontinuous 
cooperative  control  (8),  if  control  gains  aij  are  simply  chosen 
according  to  (7),  consensus  cannot  be  guaranteed  even  under 
fixed  and  undirected  communication  topology  as  illustrated 
by  the  following  example.  Nonlinear  piecewise  constant  gain 
aij{-)  needs  to  be  designed  to  solve  the  problem. 


B.  Design  and  Stability  Analysis  with  Directed  and  Switch¬ 
ing  Topologies 

As  shown  in  example  1  ,  standard  network  topology  based 
control  gain  design  for  (4)  no  longer  implies  the  consensus 
of  multiagent  systems,  even  with  the  most-restrictive  net¬ 
work  connectivity  condition  (that  is,  fixed  and  undirected 
communication).  In  this  subsection,  in  order  to  ensure  the 
multiagent  systems  consensus  with  control  (4)  under  the  least 
restrictive  sensing/communication  condition  (that  is,  sequen¬ 
tial  completeness  of  {>S'(f^}),  we  propose  a  new  nonlinear 
piecewise  gain  design.  The  convergence  of  the  overall  closed- 
loop  systems  is  proved  by  developing  a  contraction  mapping 
method  for  multiagent  systems. 

Theorem  1:  Consider  the  multiagent  system  (1)  under 
cooperative  control  (4).  Assume  that  sensing/communication 
matrix  sequence  {*S'(ffc)}  is  uniformly  sequentially  com¬ 
plete*.  Let  the  nonlinear  gain  aij  be  designed  as  follows: 
for  any  agent  I, 

1)  if  xi{tl)  =  ma,Xj(.j^^Xj{tl)  =  minj Xj{tl),  then 
aij{tl)  can  be  any  bounded  positive  value. 

2)  if  Xi{tl.)  >  maXjgTV'i  tXj{t\),  let  aij{t\)  be  selected  to 
satisfy  the  inequality 

0  <  L  (9) 


3)  if  xi{t%)  <  minjgyv'i  Xj{t%),  let  aij{t%)  be  selected  to 
satisfy  the  inequality 

0  <  y:  nM)  <  ~  (10) 


4)  if  unnJ(zJ^J■^Xj{tl)  <  xi{tl)  <  Xj{tl),  let 

aij{t\)  be  selected  to  satisfy  the  inequality 


0  <  XI  <  min 


maXjgAr, 

Ct 

Ct 


(11) 


Then  consensus  of  system  (1)  is  asymptotically  achieved  in 
the  sense  of  (2). 

Proof:  See  Appendix. 

The  nonlinear  gain  design  conditions  (9)  to  (11)  play  a 
paramount  important  role  for  the  consensus  of  multiagent 
systems  (1).  Those  conditions  are  easy  to  be  satisfied  since 
for  agent  I,  it  only  requires  the  available  neighboring  state 
information  of  agent  I  in  the  design  of  aij{tl).  For  instance, 
to  satisfy  (9),  one  simple  choice  could  be 


o^ijitk) 


xi{t%)  -  miUj-gAA,  Xj{tl) 

(|A/'d  +  l)ct 


^IgMi 


(12) 


where  |A//|  denotes  the  cardinality  of  the  set  Ni-  Same 
selection  can  be  made  for  satisfying  the  conditions  (10)  and 
(11). 


*The  time-varying  sensing/communication  topology  is  considered  here. 
If  the  topology  becomes  fixed  after  certain  time,  we  can  treat  it  as  a  special 
case  of  switching  sensing/communication  sequence  S{t^)  with  ct  being  any 
positive  constants. 
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V.  Conclusions 


C.  Multiagent  Systems  with  Time-Delays 


The  following  theorem  presents  the  control  design  and 
consensus  analysis  for  multiagent  systems  with  directed 
switching  communications  and  time-delays. 

Theorem  2:  Consider  the  multiagent  system  (1)  under 
cooperative  control  (8).  Assume  that  sensing/communication 
matrix  sequence  of  {S'(f|)}  is  sequentially  complete.  Let  the 
nonlinear  gain  aij  be  designed  as  follows:  for  any  agent  I, 

1)  if  xi{tl)  =  Xj{t%  -  n,)  =  min^-gA/-,  Xj{tl  - 

Tij),  then  can  be  any  bounded  positive  values. 

2)  if  xi{tf.)  >  maxjgTV'i  —  let  be  selected 

to  satisfy  the  inequality 


3) 


0  <  ^  < 


miiijeM  Xjjtj 
maxjct,  r} 


T-o) 


(13) 


if  xi{tl)  <  min^gA/'i  let  be  selected 

to  satisfy  the  inequality 


4) 


(14) 

if  minjgA/-,  Xj{tl  -  Tij)  <  xi{tl)  <  maxjgAT,  Xj{tl  - 
Tij),  let  be  selected  to  satisfy  the  inequality 


(maXjgAf,  Xj(tl-Tij)-x,{tl)  \ 
max{ct,r}  / 

(15) 

Then  consensus  of  system  (1)  is  asymptotically  achieved  in 
the  sense  of  (2). 

Proof:  The  proof  can  be  done  following  the  similar  proce¬ 
dure  as  shown  in  theorem  1,  and  omitted  here  due  to  space 
limitation. 


IV.  Example 

Let  us  reconsider  example  1  for  the  consensus  of  three 
agents  with  control  (4)  under  the  sensing/communication 
topologies  S{t3k),S{t3k+i)  and  S{t3k-\-2)  given  in  (5).  Un¬ 
der  the  same  simulation  conditions,  system  responses  are 
shown  in  2,  and  consensus  is  reached. 


Fig.  2.  System  responses 


In  the  paper,  we  proposed  a  new  discontinuous  cooperative 
control  for  consensus  of  multiagent  systems  with  directed 
and  switching  topologies  and  sensing/communication  delays. 
The  proposed  new  design  may  be  applied  to  address  the 
cooperative  control  problem  for  truly  nonlinear  systems. 
Further  research  will  be  devoted  to  convergence  speed  and 
performance  analysis  of  the  proposed  cooperative  control. 

Appendix 

Proof  of  theorem  7:  By  substituting  (4)  into  (1),  we  have 

n 

Xt  =  '^aii{sii{tl),xi{tl))sgn{xi{t)  -  Xi{t))  =  T^{x), 
1=1 

(16) 

where  7  =  1,  •  •  • ,  n  and  x  =  [xi,X2,  •  •  • ,  XnY ■ 

We  first  show  that  if  matrix  sequence  {S'(f^)}  is  uniformly 
sequentially  complete,  then  x  =  a;*l  is  the  only  type  of 
equilibrium  point  of  the  closed-loop  system  (16)  where  1  G 
3?"  a  vector  with  all  entries  being  1.  The  proof  is  established 
by  contradiction.  Assume  that  x®  =  [xf,---,x®]^  is  an 
equilibrium  point  satisfying  J^i(x^)  =  0,Vi  and  min^  x®  f 
maxi  X®.  Define  index  sets  4)i„in  =  {j  :  x|  =  min^  x®}  and 

4’max  =  {j  ■  =  maxi  X®}. 

Since  matrix  sequence  is  uniformly  sequentially  complete, 
which  is  equivalent  to  say  that  the  composite  graph  S{f)  has 
at  least  one  globally  reachable  node  Xg.  Apparently,  Xg  may 
be  in  or  4)„iax  or  may  not  be  in  both  sets.  For  any 

case,  there  must  exist  an  index  j  in  the  compliment  set  of 
the  set  containing  Xg  while  maintaining  a  path  to  Xg  due 
to  the  completeness  assumption.  That  is,  Xj  f  Xg,  which 
renders  Fi{x)  f  0  for  at  least  one  i,  a  contradiction. 

In  what  follows,  we  further  show  that  the  system  (16)  is 
Lyapunov  stable,  and  cooperative  stable  (consensus  can  be 
achieved). 

(a)  At  each  time  instant  f,  let  i*  denote  the  index  such  that 

Xi*{t)  =  max  Xj(t)  (17) 

j 

we  will  show  that  Xi*  (t)  is  non-increasing  over  time.  It 
follows  from  (17)  that  sgn(xi(f)  —  Xi*  (f))  <  0  for  all  f 
0.  Hence  Xi-{f)  <  0,  and  we  conclude  that  the  maximum 
value  of  maxjXj(f)  never  increases  over  time.  The  proof 
of  the  minimum  value  of  minj  Xj  (t)  never  decreasing  over 
time  is  similar.  Lyapunov  stability  becomes  obvious  from  the 
above  conclusions. 

(b)  To  prove  consensus,  we  will  show  that  the  mapping 
defined  by  differential  equation  (16)  is  a  contraction  mapping 
under  undirected  sequentially  complete  network  topologies. 
That  is,  we  will  prove  that  for  any  t,  there  exists  a  constant 
d(t)  >  0,  such  that 

max  II Xi (f -I- (5)  — Xj (7-1-5)11  <  Amax  ||xi(7)  —  Xj(7)||,  (18) 

hi 

where  0  <  A  <  1. 

Let  U  =  {1,  •  •  • ,  n}  be  the  set  of  indices  on  state  variables, 
and  at  time  t,  let  x„iax(7)  =  maxjXj(7)  and  Xniin(7)  = 
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minjXjit).  Define  sub-sets  f^max(0  = 

f^/^^max(t),  and  =  fl/^niin{t)  as  follows: 

f^max(i)  =  {i*  G  :  x,*{t)  =  Xmax(t)} 

and 

^min(^)  —  G  ^2  :  Xi^(t)  =  Xmin(f)} 

To  show  (18),  it  is  equivalent  to  prove  for  any  t,  there  exists 
a  constant  6{t)  >  0,  such  that 

II  a^max(f  “t“  a^min  (f  “t“  II  ^  A||Xmax(f)  a^min(f)||5  (19) 


It  is  readily  seen  from  (21)  that  before  network  topology 
at  time  instant  t  switches  to  another  topology,  the  value  of 
Xi,  (t  +  t)  for  0  <  T  <  5  will  increase  within  the  given  time 
interval. 

On  the  other  hand,  consider  the  agent  I  G 
with  decreasing  speed  at  time  instant  t,  and  has  one  of  its 
neighboring  agents  from  ilminit),  that  is,  we  have 

=  X!  “  2;/(f))  <  0, 

t&KA+i)  (22) 


for  some  0  <  A  <  1. 

It  follows  from  points  (a)  and  (b)  that  for  any  time  interval 
5{t),  we  have 

^max  {t  +  6)< 

^max(^)j  ^min  (f  +  5)  > 

Xmin  (f),  (20) 

thus,  a  weaker  version  of  inequality  (19)  holds  for  some 
0  <  Ai  <  1  under  arbitrary  network  conditions  and  system 
dynamics  constraints. 

Now  let  us  show  that  if  for  any  t,  there  exists  a  hnite  value 
6{t)  such  that  the  composite  network  topology  is  complete 
within  the  time  interval  [t,t  +  (5),  then  (19)  always  holds. 

Consider  the  evolution  of  Xi»{t)  for  every  i*  G  flmaxCf) 
and  Xi^{t)  for  every  G  nniin(f)-  Several  cases  are  in  the 
sequel. 

Case  I:  If  there  are  leader  nodes^  staying  in  flniax(iO  for 
t'  G  [t,t  +  6),  which  means  they  don’t  directly  or  indirectly 
receive  any  information  from  members  in  and 

hence  Xmax{t  +  S)  =  Xma^lt).  Thus  all  the  agents  in  Omin(f) 
must  have  either  direct  or  indirect  (through  agent  in  flmin(i)) 
information  exchange  with  members  in  their  complement 
set  during  the  time  interval  [t,t  +  S{t)),  otherwise, 

it  contradicts  the  completeness  assumption  for  composite 
network  topology  within  the  time  interval  [t,t  +  6 (t)).  The 
question  then  becomes  to  verify  Xynin{t  +  <))  >  Xniin(i)  in 
order  to  prove  that  (19)  holds. 

Now  consider  the  evolution  of  Xi^(t)  and  xiit)  for  I  G 
f^min(f)'  According  to  point  (a),  the  states  Xi^{t)  have  the 
tendency  of  increase,  and  xi{t)  have  the  tendency  of  either 
increase  or  decrease.  If  we  can  show  that  all  Xi^  (t)  will 
increase  in  [t,t  +  S{t)),  and  the  decreasing  agents  xi{t) 
will  not  reduce  their  values  to  some  Xi^  at  time  instant  t, 
that  is,  Xi^{t),  then  a;min(f  +  <))  >  a;min(f)  is  apparent. 
Since  by  completeness  assumption,  every  agent  i*  G  flmin(f) 
will  have  a  chance  to  communicate  with  some  agents  in 
f^min(f)’  without  loss  of  generality,  we  consider  that  agent 
i*  G  ^lrain{t)  has  the  communication  with  agents  in 
right  at  time  instant  t,  then  we  have 

Xj{t)  >  Xi,{t),yj  G  M.  n  fimin(f) 

and  thus 

X!  “Gi(f)>0-  (21) 

node  is  called  a  leader  node  if  it  does  not  receive  information  from 
other  nodes  or  only  has  communication  with  nodes  in  its  current  set. 


It  follows  that 


and 


-  X!  <  0  (23) 

jeM 

xiit  +  t)>  xi(t)  -  ^  aij{tl)T  (24) 


Note  also  that  since  xi  <  0,  agent  I  must  satisfy 

xi{tl)  >  maxjgAT,  or  Xj{tl)  <  xi{tl)  < 

maxjg^j  Xj{tl).  To  this  end,  due  to  the  nonlinear  gain 
selected  in  (9)  or  (11),  we  have 

jeMi  jeMi 

^  xiitj)  -  Xjjtl) 

Ct 

Note  that  minjgTv'i  Xj{tl  —  Tij)  >  Xniin(f)  ,  we  further  have 

„  us\  ^  2;;  (f^)  —  Xinin(i) 

/  .  ^  - 

i  6  Vi  * 

together  with  (24),  we  have 


(25) 


(26) 


xi{t  +  T)  >  + 


Xr] 


.it) 


Ct 


(27) 


Therefore,  we  know  that  before  network  topology  at  time 
instant  t  switches  to  another  topology,  the  value  of  xi  {t  +  t) 
for  some  0  <  t  <  (5  will  keep  decreasing  but  not  achieving 
to  the  minimum  value  at  time  instant  t  (that  is,  Xmin(f))  in 
any  given  time  interval  smaller  than  ct .  In  conclusion,  for  all 
agents  i*  G  nmin(i),  their  values  will  increase  during  time 
interval  [t,  t+6)',  for  the  decreasing  agents  xi{t),  I  G 
their  values  will  not  be  able  to  reduce  to  Xniin(f)-  Hence,  we 

have  Xmin(f  T  <))  V  2^min(f)- 

Case  II:  Similar  argument  is  true  for  the  case  of  leader 
nodes  staying  in  Hmin(f)  in  time  interval  [t,  t  +  S).  That  is, 
for  all  agents  i*  G  Hniax(f),  their  values  will  decrease  during 
time  interval  [t,t  +  6)', 

For  agent  I  G  H^ax(f)  which  acquires  the  increasing  speed 
at  time  instant  t,  we  have 

=  Y  '^yitD^S^ixjit)  -  Xi{t))  >  0, 

ty[tltl+i)  (28) 
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It  follows  that 


(29) 


and  x^ia{t+S)  >  x^in{t)-  This  completes  the  proof. 

□ 


0  <  i;(f)  < 

jeMi 

and 

xi{t  +  T)  <  xi{t)  +  ^  aij{tk)T  (30) 

Note  also  that  since  xi  >  0,  agent  I  must  satisfy 

xi{tl)  <  mmje^r,Xj{tl)  or  mmj^j^^Xjitl)  <  xi{tl)  < 
maxjgTvi  Thus,  according  to  gain  selection  dij{t)  in 

(10)  or  (11),  we  have 

jeMi  jeMi 

Ct 

which  further  leads  to 

^  (32) 

ieA/'i 

and 

\  ^  /4-\  I  aJmax(f)  ^  /'4-\ 

Xl{t  +  t)  <  Xl{t)  H - z - —T  <  Xinax(f), 

Ct 

since  max^g^/'i  Xj(tl)  <  x^nxit).  Thus  for  a:i„in(i  +  <))  = 

^min(f)?  we  have  Xniax(f  “t“  <  2^max(^)- 

In  summary,  for  both  Case  I  and  Case  II,  inequality  (19) 
holds. 

Case  III:  Now  we  consider  the  case  in  which  there  are  no 
leader  nodes  in  llmax(i)  and  ni„in(f).  Then,  for  every  i*  e 
nmax(i),  we  must  have  Xi-{t  +  6)  <  Xi*{t),  because  system 
Xi*  (t)  must  have  state  exchange  with  at  least  one  element  in 
their  complement  sets  during  time  interval  [t,t  +  6).  Same 
argument  holds  for  Xi^  (t),  and  we  have  Xi_^  {t  +  6)  >  Xi^  (t). 
Several  sub-cases  follow: 

Case  III-l:  If  flmax(t)  n  flmax(t  +  5)  ^  %  and  flmin(f)  n 
+  iJ)  7^  0,  which  means  at  least  one  i*  €  flmaxCf) 
remains  staying  in  Ct^g^^{t+S),  and  at  least  one  z*  G  llmin(f) 
remains  staying  in  nmin(i  +  thus  (19)  holds. 

Case  III-2:  llmax(t)  n  flmax(f  +  6)  ^  9  and  llmin(f)  n 
^min{t  +  S)  =  0.  It  follows  from  flmaxCO  H  flniax(f  +  <5)  ^  0 
that  a;max(f  -f  5)  <  x^na^it)-  On  the  other  hand,  we  have 
a:min(f  +  S)  >  x^iy^{t)  from  point  (b).  Thus,  (19)  holds. 

Case  111-3:  ilmaxit)  C  -f  5)  =  0  and  ilminit)  C 

Omin(t  +  <5)  7^  0.  It  follows  from  Qmir,(t)  n  Ominff -|-(5)  ^  0 
that  a;min(f  S)  >  a;niin(f)-  It  then  suffices  to  show  that 
Xmaxit  S)  <  Xraaxif),  which  is  always  true  from  point  (a). 

Case  ni-4:  Omax(i)  n  Omax(f  -I-  5)  =  0  and  ilminit)  n 
Omin(f  +  <^)  =  0-  This  means  that  the  entries  in  0„iax(i  +  <5) 
and  Oi„in(f  -f  S)  are  completely  from  the  complement  sets 
I^max(i)  and  respectively. 

Due  to  the  undirected  network  topology,  and  following 
the  same  argument  in  case  I,  we  know  that  for  any  entry 
of  f2^ax(0’  its  maximum  increase  in  S{t)  can  only  reach  to 
some  value  less  than  Xniax(f),  and  for  any  entry  of 
and  its  maximum  decrease  in  S{t)  can  only  reach  to  some 
value  greater  than  a;niin(f)-  Thus,  we  have  Xraax{t  +  5)  < 
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ABSTRACT 

In  this  paper,  we  propose  an  approximate  policy  itera¬ 
tion  method  for  cooperative  control  of  multiagent  systems 
under  the  limited  sensing/communication  topology.  By 
considering  a  class  of  nonlinear  multiagent  systems,  the 
cooperative  control  problem  is  formulated  as  making  all 
systems  achieve  consensus  while  minimizing  the  individ¬ 
ual  sensing/communication  topology  dependent  cost  func¬ 
tions.  To  solve  the  induced  multiagent  Hamilton-Jacobi- 
Bellman  (HJB)  equations,  a  multiagent  policy  iteration  al¬ 
gorithm  is  proposed  with  convergence  proof.  Neural  net¬ 
work  parameterization  is  further  employed  to  approximate 
value  function  to  deal  with  unknown  system  dynamics. 
Through  seeking  the  least-squares  solution  based  on  the 
measured  online  sensing/communication  data,  the  approx¬ 
imate  multiagent  policy  iteration  algorithm  is  obtained  to 
solve  the  posed  optimal  cooperative  control  problem  for 
multi  agents.  Simulation  results  illustrate  the  effectiveness 
of  the  proposed  optimal  cooperative  control. 

KEY  WORDS 

Multiagent  policy  iteration.  Cooperative  Control,  Multia¬ 
gent  Systems,  Neural  Network. 

1  Introduction 

Cooperative  control  of  multiagent  systems,  in  particular 
consensus  control  of  multiagent  systems,  has  been  one 
of  the  dominating  research  subjects  in  the  current  control 
community  due  to  numerous  potential  applications  in  the 
areas  such  as  robotic  network  [11][14][15],  power  network 
[20],  to  name  but  a  few.  The  research  for  cooperative  con¬ 
trol  has  been  focused  on  two  types  of  major  issues:  the 
necessary  and  sufficient  multiagent  network  connectivity 
condition  for  information  exchange  [6][13],  and  the  design 
of  locally  distributed  cooperative  control.  While  fruitful  re¬ 
sults  for  cooperative  control  design  have  been  obtained  for 
first-order  linear  systems  [8]  [16],  for  second-order  linear 
systems  [21],  for  high-order  linear  systems  [13],  and  for 
nonlinear  systems  [10]  [9]  [23],  few  results  are  available  for 


optimal  cooperative  control  design.  There  appeared  some 
recent  work  in  the  study  of  optimal  cooperative  control, 
such  as  those  in  [19]  [2]  [4]  [12].  Nonetheless,  it  is  still  a 
challenge  issue  to  systematically  address  the  optimal  coop¬ 
erative  control  problem  for  more  general  nonlinear  multia¬ 
gent  systems,  particularly,  in  the  presence  of  model  uncer¬ 
tainties.  In  this  paper,  we  develop  an  approximately  adap¬ 
tive  multiagent  policy  iteration  (MPI)  algorithm  to  coop¬ 
eratively  solve  the  consensus  problem  for  multiagent  sys¬ 
tems. 

The  result  reported  in  this  paper  aims  to  present  a 
dynamic  programming  solution  to  multiagent  cooperative 
control.  Eor  multiagent  optimal  cooperative  control,  the 
key  issue  is  how  to  establish  an  optimality  equation  and 
find  its  solution  in  real  time.  We  tackle  this  problem  by 
considering  a  general  class  of  feedback  linearizable  nonlin¬ 
ear  multiagent  systems.  We  assume  that  there  exist  admis¬ 
sible  cooperative  controls  for  such  kind  of  multiagent  sys¬ 
tems  under  the  complete  sensing/communication  condition 
[13].  Since  this  paper  is  centered  on  the  design  of  approxi¬ 
mately  adaptive  optimal  cooperative  control,  the  fixed  sens¬ 
ing/communication  topology  is  imposed  for  ease  of  design. 
The  case  for  more  complicated  time-varying  sensing  com¬ 
munication  topology  will  be  treated  in  future  work.  The 
optimal  cooperative  control  problem  is  then  formulated  as 
making  all  systems  achieve  consensus  while  minimizing 
the  individual  sensing/communication  topology  dependent 
cost  functions.  It  is  shown  that  the  optimal  solution  to  the 
defined  problem  requires  to  solve  a  multiagent  Hamilton- 
Jacobi-Bellman  (HJB)  equation.  To  avoid  the  obstacles  in 
analytically  solving  multiagent  HJB  equation,  we  extend 
the  online  policy  iteration  approach  in  [18] [22]  to  the  mul¬ 
tiagent  case,  and  employ  RBE  neural  networks  to  approx¬ 
imate  value  functions  at  each  iteration.  Through  seeking 
the  least-squares  solution  to  estimate  the  optimal  neural 
weights,  a  new  approximately  adaptive  multiagent  policy 
iteration  algorithm  is  proposed.  It  is  further  shown  that 
the  proposed  adaptive  optimal  cooperative  control  approx¬ 
imately  solves  the  posed  optimal  consensus  problem.  Sim¬ 
ulation  results  are  provided  to  illustrate  the  effectiveness  of 
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the  proposed  optimal  design. 

2  Problem  Formulation 

Consider  a  multiagent  system  which  has  N  members  and 
each  agent  assumes  the  general  nonlinear  dynamics 


iz  =  fi{xi)  +  gi{xi)u„  (1) 


where  i  e  H  =  {I,-''  Xi{t)  €  5ft"  is  the  system 
state,  Ui  G  Jft™  is  the  control  input  to  be  designed,  fi,gi  : 
5ft"  I— >  5ft"  are  locally  Lipschitz  continuous  functions. 

The  objective  of  this  paper  is  to  design  an  optimal 
cooperative  control  Ui{t)  to  achieve  the  consensus  of  the 
multiagent  system  (1)  such  that 

lim  Xi{t)  =  X*,  Vi,  (2) 

t—¥00 

while  minimizing  the  following  individual  cost  function  for 
each  agent  i, 


J i  (ttj ,  Xi  (to)  5  ^ij Xj  (^o)  ) 


/.OO  /  W  \ 

I  I  ^  ^ (^Xj  Xj^  SijQij(^Xi  Xj')  Uj  RiUi  1 
Jto  \j^^  ) 


where  x*  is  some  constant  denoting  the  consensus  value, 
Qij  and  Ri  are  symmetric  and  positive  definite  matrices, 
and  Sij  is  a  binary  number  describing  the  availability  of  the 
sensing/communication  information  exchange  between  the 
agent  i  and  the  agent  j. 

The  success  of  solving  consensus  problem  defined  in 
(2)  is  dependent  on  the  sensing/communication  informa¬ 
tion  exchange  among  agents,  which  can  be  described  by  a 
N  X  N  sensing/communication  matrix  defined  below 


Sll 

Sl2  •  • 

Sin 

S21 

S22 

S2N 

sni 

SN2  •  ■ 

snn 

where  su  =  1;  Sij{t)  =  1  if  the  ith  agent  can  receive 
the  information  from  the  jth  agent,  and  Sij  =  0  if  other¬ 
wise.  In  general,  sensing/communication  matrix  S  could 
be  time-varying  due  to  the  changing  environment.  In  this 
paper,  we  focus  on  the  design  of  approximately  adaptive 
optimal  cooperative  control  for  multiagent  systems  under 
the  assumption  of  the  sensing/communication  matrix  S  be¬ 
ing  constant  and  satisfying  the  completeness  condition  for 
its  connectivity.  The  completeness  condition  for  network 
connectivity  was  developed  in  [13],  which  can  be  summa¬ 
rized  into  the  following  definition. 


Definition  2.1  Sensing/communication  matrix  S  is  said  to 
be  complete  if  in  every  block  row  i  of  its  lower  triangular 
canonical  form,  there  is  at  least  one  j  <  i  such  that  the 
corresponding  block  is  nonzero. 


For  more  general  sequentially  changing  sens¬ 
ing/communication  topology,  the  sequential  completeness 
concept  of  the  sensing/communication  matrix  sequence 
{S'(f)}  was  also  introduced  in  [13].  It  is  equivalent  to  the 
condition  of  that  there  exists  a  spanning  tree  in  the  commu¬ 
nication  graph  [11],  which  represents  the  least  restrictive 
connectivity  condition  for  sensor/communication  network 
in  order  to  achieve  the  asymptotically  cooperative  consen¬ 
sus  of  the  overall  multiagent  system.  In  this  paper,  the  fixed 
S  is  considered,  the  completeness  condition  is  described  in 
definition  1 .  We  will  utilize  the  following  assumptions  for 
the  design  of  optimal  cooperative  control. 

Assumption  2.1  The  sensing/communication  matrix  S  in 
(4)  is  complete. 

Assumption  2.2  For  nonlinear  multiagent  systems  (1), 
there  exist  admissible  cooperative  control  policies  Ui  (t)  to 
solve  the  consensus  problem  defined  in  (2). 

To  this  end,  the  optimal  cooperative  control  problem 
can  be  formulated;  given  the  nonlinear  multiagent  systems 
(1),  the  set  of  admissible  cooperative  control  policies,  and 
the  infinite  horizon  cost  function  (3)  for  individual  agents, 
find  an  admissible  cooperative  control  policy  such  that  the 
cost  function  (3)  achieves  its  minimum. 

3  The  Proposed  Approximate  Policy  Itera¬ 
tion  for  Multiagent  Cooperative  Control 

3.1  Multiagent  HJB  Equation 

Recall  that  the  cost  function  for  agent  i  is  defined  in  (3), 
which  may  be  rewritten  as 

J i  {tii  ,  Xi  (f  o)  J  ^ijXj  (to)  ) 

(  (,Xi  Xj)  Qij(,Xi  Xj)  “t“  Ui  RiUi  1  dt,(5) 

je^r^  J 

where  A/)  =  {j  G  n|sij  f  0}  denotes  the  neighbor  set  of 
agent  i.  The  following  lemma  is  instrumental  in  developing 
the  multiagent  Hamilton-Jacobi-Bellman  (HJB)  equation. 

Lemma  3.1  For  admissible  cooperative  control  ut  (t),  if 
there  exists  a  positive  definite  continuously  differentiable 
function  Vfixi,  SijXj]Ui)  satisfying  the  following  property 

dV^ 

-w^ifiixi)  +  gz{xi)ui) 

UXi 

+  -g)^(M^3)+9zixj)uj) 

j&Mj  ^ 

T  ^  )  )Xj  Xi)  Qij)Xj  Xi)  “t“  Ui  RiUi  —  0(6) 

jeW 

and  the  boundary  condition  Vi{xi{oo),  SijXj{oo);  Ui)  =0, 
then  Vi(xi,  SijXj]Ui)  is  the  value  function  for  system  (1) 
for  all  t,  and 

Vii^Xiito) ,  SijXj  (Iq)  ,  Ui)  —  Ji{Ui,  Xiiff) ,  SijX  jiff))  (7) 
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To  this  end,  it  follows  from  lemma  3.1  and  Bell¬ 
man’s  principle  of  optimality  [3],  we  know  that  the  optimal 
value  function  V* SijXj{t))  approximately  satisfies 
for  small  A  — >  0 

V*{x^{t),SijXj(t))  (8) 

~  T[im[l{xi{t),Xj{t),Ui)A  +  V*{xi{t  +  A),SijXj{t  +  A)], 

Ui 

where  lixiit^jXjit^^Ui)  — 

Xj)  +ujRiUt,  Xi{t  +  A)  ~  Xi(t)  +  {fi{xt)  +  gi{xi)ut)A, 
anda;j(f  +  A)  Xj{t)  +  {fi{xj)  +  gi{xj)uj)A.  Since  F/ 
is  continuously  differentiable,  we  may  write  (for  A  0) 


with  the  associated  boundary  condition  V*  (x* ,  Sijx* )  =  0, 
which  requires  that  the  optimal  value  must  be  null  when 
evaluated  on  an  extremal  trajectory  (all  agents  in  the  set 
reach  consensus.) 

Equation  (14)  is  the  mutliagent  HJB  equation.  Its 
solution  would  provide  the  optimal  cooperative  control  in 
(13).  However,  it  is  difficult  to  solve  mainly  for  two  rea¬ 
sons.  First,  equation  (14)  is  a  nonlinear  partial  differen¬ 
tial  equation,  and  it  is  in  general  impossible  to  solve  this 
equation  in  analytic  form.  Second,  the  coupling  terms 

“air  ifi{xj)+gi{xj)uj)  cause  extra  difficulty  due 
to  involvement  of  Uj  which  may  require  information  prop¬ 
agation  from  agents  not  in  the  neighboring  set  A/). 


V* {xiit  +  A),SijXj(t  +  A))  ~  V*{xi{t),SijXj{1:)) 

dV*'^ 

+  {xi (t) ,  Xj  (t))[f^ (x,)  +  gi{x,)ui]A 

_ ^  QY*X' 

+  X!  +  9i{Xj)Uj]Ai9) 

Substituting  (9)  in  (8)  we  obtain  the  multiagent  HJB  equa¬ 
tion 

Q  =  T[\mHi{xi,SijXj,Ui,V*)  (10) 

Ui 

where  the  multiagent  Hamiltonian  is  defined  as 


3.2  Multiagent  Policy  Iteration  Algorithm 

Motivated  by  the  policy  iteration  algorithm  for  solving  HJB 
equation  for  single  agent  systems  in  [18],  in  what  follows, 
we  provide  the  mutliagent  policy  iteration  algorithm  for  ap¬ 
proximately  solving  the  multiagent  HJB  equation  (14).  The 
proposed  multiagent  policy  iteration  algorithm  consists  of 
the  following  two  steps: 

Step  1:  Policy  evaluation.  Find  an  admissible  coop¬ 
erative  control  policy  Ui^o{xi,  SijXj).  For  any  inte¬ 
ger  I  >  0  denoting  the  iteration  index,  solve  for 

(yXi ,  ^ijXj ,  i )  using 


R-i  {.Xi ,  Sij  Xj  ,  Ui ,  V)  ) 

—  ^  ^  (.Xj  Xi'^  Qij(^Xj  Xi^  “t“  Ui  RiUi 

j&Mi 

dV^ 

+  ^^(.fi(.Xi)  +gi{xi)ui) 

OXi 

QyT 

+  +  9^{xj)uj)  (11) 

jeMi  ^ 

The  minimum  with  respect  to  Ui  is  obtained  by  solving 

dHiiXi.SiiXj  ,Ui,V*)  r\  ..A.  ' 

—  ^  ^  ^  ^  ^  =  0,  that  IS, 


dui 


2ujRi  -f  -^^9i{xi)  =  0 


(12) 


0  —  ^  ^  (.Xj  Xi)  Qij(Xj  Xi)  “t“  Ui  jRiUij 

jeM 

dV^i 

+-^(Mxi)  +  gi(xi)u,j) 

+  ^  -^(fi(xi)+gt(xj)uj),  (15) 

ieM  ^ 

'f3\thVij(x* ,  SijX*)  =  0. 

Step  2:  Policy  improvement.  Update  the  control  policy 
by 

=  (16) 


yielding  the  optimal  cooperative  control 


2^*  dx. 


(13) 


Substituting  (13)  into  (10)  yields 


qY*t 

0  —  ^  ^  (Xj  Xi)  Qij(Xj  Xi)  TT-  fi(Xi) 

IdV*^  .  ,  _i  ,  .^dV* 

4  dx,  dx. 


-E 

jeA/'i 


dV: 


*T 


dxi 


-(Mxj)  +  g^(xJ)uj), 


(14) 


The  convergence  of  the  multiagent  policy  iteration  al¬ 
gorithm  given  in  (15)  and  (16)  is  summarized  into  the  fol¬ 
lowing  theorem. 

Theorem  1  If  a  sequence  of  pairs  {Vij,Uijj.i}  is  gener¬ 
ated  by  (15)  and  (16),  then  the  corresponding  value  func¬ 
tion  Vij  satisfying 


<  V,j  (17) 

and 

lim  =  u;  (18) 

l—yoo 
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The  proof  of  theorem  1  can  be  done  following  the 
similar  lines  of  reasoning  as  that  of  theorem  4  in  [18].  De¬ 
tails  are  omitted  due  to  space  limitation.  The  proposed  mul¬ 
tiagent  policy  iteration  algorithm  relieves  the  nonlinearity 
obstacle  in  solving  multiagent  HJB  equation  (14)  to  a  cer¬ 
tain  level  in  the  sense  of  only  dealing  with  a  linear  partial 
differential  equation.  For  instance,  by  linearly  parameter- 
ayT 

izing  the  solution  to  (15)  can  be  obtained  from  a  set 
of  linear  algebra  equations.  However,  Uj^i  for  j  €  Ni  are 
still  needed  in  solving  (15),  which  might  be  hard  to  directly 
be  sensed  and/or  communicated. 

To  avoid  this  obstacle,  we  note  that  the  solution  Vi^i 
to  (15)  is  actually  the  value  function  for  system  (1)  at  the 
iteration  I,  since  it  satisfies  the  properties  in  lemma  3.1. 
Thus,  we  obtain 

(/)  )  (/)  5  Xi{t^  ^  SijX  . 

It  follows  from  the  above  equation  and  (3)  that 

{Xi  (/)  5  ^ij Xj  (/) ,  lli^l  (/)  ) 


+Vi^i{xi{t  +  T),SijXj{t  +  T);  +  T)),  (19) 

where  T  >  0  represents  certain  time  interval.  To  this 
end,  the  policy  evaluation  step  in  the  proposed  multia¬ 
gent  policy  iteration  algorithm  can  be  replaced  by  equation 
(19)  for  solving  for  Vij  based  on  the  available  information 
Xi{t),  Xj{t)  and  during  the  time  interval  [/,  t  +  T], 

3.3  Approximate  Policy  Iteration 

A  significant  advantage  of  the  proposed  multiagent  policy 
iteration  algorithm  is  that  it  iteratively  generates  a  sequence 
of  pairs  through  (19)  and  (16)  by  only  us¬ 

ing  the  available  local  information  Xi,  xj  and  Ui  for  agent 
i,  which  monotonically  converges  to  the  optimal  value  V* 
and  u*.  It  is  apparent  that  the  key  is  to  solve  for  /  from 
(19).  To  facilitate  the  design  and  for  the  ease  of  implemen¬ 
tation,  in  the  sequel,  we  hypothesize  that  Vi^i  has  a  linearly 
parameterized  form  as 

m 

Vi,l{Xi,  SijXj)  =  ^ 

fi=l 

(20) 

where  =  [(t)iuh4'i2,h- ' '  ^ 

are  some  known  basis  functions,  and  9*  i  = 

[9ii,i,  012,1,  ^  unknown  constant 

parameters  to  be  estimated. 

It  is  worth  pointing  out  that  the  value  functions  F] ; 
are  generally  nonlinear  and  may  not  assume  the  exact  para¬ 
metric  form  as  that  in  (20).  In  that  sense,  a  linearly  pa¬ 
rameterized  approximator  can  be  used  to  approximate  un¬ 
known  nonlinear  value  function  Vi^i.  Several  function  ap¬ 
proximators  are  available  for  this  purpose,  such  as,  radial 


basis  function  (RBF)  neural  networks  [5,  17],  high-order 
neural  networks  [7]  and  fuzzy  systems  [24],  which  are  de¬ 
scribed  as  W'^ S{z)  with  input  vector  z  €  i?",  weight  vec¬ 
tor  W  G  R\  node  number  I,  and  basis  function  vector 
S{z)  G  ■  Universal  approximation  results  indicate  that, 
if  I  is  chosen  sufficiently  large,  then  W'^S(z)  can  approxi¬ 
mate  any  continuous  function  to  any  desired  accuracy  over 
a  compact  set  [7,  17]. 

In  this  paper,  we  assume  that  the  value  functions  U  ^ 
are  approximated  by  RBF  neural  networks.  That  is,  for 
the  unknown  value  functions  Vi^i{xi,  SijXj),  we  have  the 
following  approximation  over  some  compact  set  fli 

^ijl^Xi,  SijXj^  —  -f  (Xj) ,  VXj  G  (21) 

where  Xj  =  [sjiXi,  Si2X2,  •  •  •  ,Xi,  -  ■■  ,  SijXj,  ■  ■  ■  ,SiNXN]'^, 

0*  I  G  is  an  unknown  constant  parameter  vector,  the 
neural  network  node  number  k  >  1,  uiij{xi)  is  the  approx¬ 
imation  error,  and  ^^,i{x^)  =  [4>i^^i,(j)i2,i,  -  ■  ■  is 

the  known  basis  function  vector. 

Upon  using  the  function  approximator  (21),  the  policy 
evaluation  equation  in  (19)  becomes 


=  [^^,lix,{t))  -  $j,,(xj(/ +  T))]^0*,  +cCj//),(22) 

where  -uJi^i{t  +  T). 


Remark  3.1  Based  on  the  universal  approximation  theo¬ 
rem  [7,  17],  approximation  error  uJij(xi)  will  uniformly 
converge  to  zero  as  the  neural  network  node  number  li  -G 
oo.  In  other  words,  jU,;  ~  —t  0  as  li  —>■  oo.  Thus, 

-G  0  as  li  ^  OO,  which  implies  that  (22)  can  be 
used  as  an  approximation  for  the  policy  evaluation  in  the 
proposed  multiagent  policy  iteration  algorithm.  o 

It  follows  from  (22)  that  9*  ^  provides  the  best  approx¬ 
imate  solution  for  the  policy  evaluation.  However,  its  value 
is  unknown,  and  needs  to  identified  online.  Let  0i,i{t)  be 
the  estimate  of  9*^,  and  equation  (22)  becomes 


=  -  $i,;(xi(/  -f  T))]'^ 9i^i(t)  -f  Ci^iii^i) 

where  =  [T>j,z(xi(/))  -  $j,z(xj(/  -f  T))]'^  9^^i{t)  -f 

dJi,i{t),  9i,i{t)  =  9* I  —  9i,i{f).  Thus,  given  any  admissible 
cooperative  control,  the  parameter  9iy  should  be  chosen  to 
minimize  the  squared  approximation  residual  error  e'f  i{t). 
As  9i,i(t)  — )•  9*1,  it  is  obvious  that  ei^i{f)  — >  uji^i. 

In  what  follows,  we  present  the  proposed  adaptive 
law  for  9i^i  using  the  least-squares  estimation.  To  pro¬ 
ceed  with  the  proposed  adaptive  design,  we  introduce  an 

infinite  sequence  of  time  instants  {tk  =  fo  +  kT}  for 

k  G  M  =  {0, 1,  •  •  •  ,  }  with  T  >  0  the  sampling  time.  The 
proposed  adaptive  multiagent  policy  iteration  algorithm  re¬ 
lies  on  two  types  of  updating  intervals: 
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1)  Control  action  interval  \tk^tk+l),  in  which  the  same 
control  policy  Ui^i  will  be  applied; 

2)  Observation  interval  in  which  n  sets  of 

control  action  with  the  same  control  policy  Uij  will 
be  applied,  and  observation  data  during  the  n  intervals 
will  be  used  for  the  least-squares  estimation  of 

Remark  3.2  Note  that  the  observation  interval  is  imposed 
for  the  least-squares  solution  ofVi^i  based  on  (23)  for  the 
policy  evaluation  step  in  the  proposed  multiagent  policy 
iteration  algorithm,  while  the  control  action  interval  corre¬ 
sponds  to  the  implementation  of  control  Ui^ifrom  the  policy 
improvement  step.  o 

For  notational  convenience,  let  us  define 


Zi{tk) 


and 


=  ^i,l{Xi{tk))  -  ^i,l{x^{tk+l)). 

Substituting  this  into  (23)  yields 

Ziitk)  =  'i’i,i{tk)'^0i^i  -f  eij{tk)  (24) 

The  model  in  (24)  is  the  regression  model  for  policy  itera¬ 
tion  and  rhi ;  is  called  the  regressor.  Through  the  observa¬ 
tion  interval  [tk,  tk+n],  pairs  of  observations  and  regressors 
{{zi{tk+^),'i’i,i{tk+f^)),  fJ.  =  0,  l,n-  1}  are  obtained  by 
using  control  policy  The  parameter  i  will  be  chosen 
to  minimize  the  least-squares  loss  function 

^  n— 1 
fi=i 

To  this  end,  standard  least-squares  estimation  algorithm 
renders 

9,,i  =  (25) 

where  =  [zi{tk),z,{tk+i),- ■  ■  ,  and 

Thus,  according 

to  policy  improvement  step  in  (16),  and  noting  = 
9i,i  the  control  law  is 

n,,z+i  =  --R-^gJ (26) 
2  oxi 

The  above  results  can  be  summarized  into  the  follow¬ 
ing  proposition. 

Proposition  3.1  Under  assumptions  2.1,  2.2  and  ??,  the 
control  law  (26)  with  adaptive  law  (25)  approximately 
solves  the  optimal  cooperative  consensus  problem  for  mul¬ 
tiagent  nonlinear  system  (1)  by  minimizing  the  cost  func¬ 
tion  (3). 


Proof:  The  proof  can  naturally  be  done  following  the 
above  multiagent  policy  iteration  design  steps,  the  least- 
squares  estimation,  the  claims  in  lemma  3.1,  theorem  1  as 
well  as  the  universal  approximation  theorem  for  neural  net¬ 
work  function  approximation.  □ 


Remark  3.3  The  implementation  of  estimation  algorithm 
in  (25)  requires  an  excitation  condition  for  matrix 
which  could  be  satisfied  with  the  careful  choices  of  basis 
function  for  neural  network  approximators.  To  further  re¬ 
duce  the  computation  load  due  to  the  computation  require¬ 
ment  for  matrix  inverse,  in  what  follows,  we  give  a  simpli¬ 
fied  adaptive  recursive  algorithm  for  Oi^i.  O 

The  simplified  adaptive  recursive  update  for  0iy 
is  based  on  Kaczmarz’s  project  algorithm  [1].  That 
is,  one  pair  of  data  {zi{tk),'i’i,i{tk)}  generates  an 
estimate  9ij{tk).  Once  a  new  measurement  pair 
{zi(tk+i),  is  obtained,  it  is  natural  to  choose 

the  new  estimate  6ij{tk+i)  as  that  minimizes  the  following 
cost  function 

^  =  2^9i,i{tk+i)  —  {9i,i{tk-i-i)  —  9i,iitk)) 

-\-X(zi{tk+i)  —  'i>ij{tk+i)0i^i(tk+i)),  (27) 

where  A  is  a  Lagrangian  multiplier.  Taking  derivatives  with 
respect  to  Oi^fitk+i)  and  A,  we  obtain 

9i,i(tk+i)  —  9i,i{tk)  —  X^i^i{tk+i)  =  0,  (28) 

Zi{tk+i)  —  ^  i,i{tk+i)9i,i{tkJri)  =  0-  (29) 

Solving  the  above  equations  yields 


-f 


9i,l{fk+l)  —  9i,l{lk) 


'^i,l{tk+l)^'^i,l{tk+l, 


fc-l-lj 


To  avoid  the  possible  singularity  for  the  term 
for  computation  stability,  a 
modified  algorithm  for  (30)  would  be  used  in  practice  as 
given  below  by  the  double-column  formula  (31),  where 
7  >  0  is  the  learning  rate  and  a  is  a  positive  constant. 

In  summary,  the  proposed  approximately  adaptive 
multiagent  policy  iteration  (MPI)  algorithm  is  given  in  Al¬ 
gorithm  1. 


Algorithm  1  Approximately  Adaptive  MPI  Algorithm 

1:  Let  I  =  0.  Given  initial  states  xfitf),  SijXj{to),  let  the 
initial  admissible  cooperative  control  policy  be  Ui^. 

2:  Employ  the  control  policy  Uij,  and  within  the  observa¬ 
tion  interval  [tixn,  t(^i+i)n-i],  collect  the  data  pairs 

{  (-2'i  (f  /  X  j  ^i,0  X  )  )  ;  T  —  0,  1,  TT  1} 

3:  Solve  for  9i  i  from  (31). 

4:  Solve  for  from  (26). 

5:  Let  /  3—  ( -f  1,  and  go  to  step  2,  until 

where  e  >  0  is  a  sufficiently  small  predefined  thresh¬ 
old. 
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(31) 


7^»,i(4+l) 


'i’i,l{tk+l)^'^i,l{tk+l)  +  OL 


)  ~  '^i,l{tk+l)di,l{tk)) 


4  Simulation  Results 


To  illustrate  the  proposed  approximately  adaptive  coopera¬ 
tive  control,  we  consider  a  simple  multiagent  system  with 
3  agents  modeled  by  the  following  single  integrators 


Xi  =  Ui,  1  =  1,  2,  3  (32) 


where  Xi  G  Stt,  and  Ui  G  Stt.  Let  the  sensing/communication 
topology  among  3  agents  be  given  by 


S  = 


1  1  0 
0  1  1 
1  0  1 


Apparently,  S  matrix  is  complete,  and  admissible  cooper¬ 
ative  control  exists  for  the  consensus  of  (32).  Select  the 
weight  matrices  in  (3)  as  Qij  =  =  0.25  for  sim¬ 

ulation  purpose.  We  use  a  single  neural  node  approxi¬ 
mator  for  each  value  function  Vi^i.  Based  on  S  matrix, 
we  choose  the  basic  functions  as  T*!./  =  (xi  —  X2)^, 
^2,1  =  {x2  —  and  $3  ;  =  {x^  —  Xi)'^  for  value  func¬ 
tions  Vi V2,i  and  V3J,  respectively.  System  initial  states 
are  xi(0)  =  0.5,a::2(0)  =  0.2  and  a;3(0)  =  0.3.  Apply¬ 
ing  the  proposed  approximately  adaptive  MPI  algorithm  in 
Algorithm  1,  the  correspondingly  cooperative  controls  are 
of  the  form  (26),  and  the  adaptive  laws  for  ;  are  given 
by  (31).  Figure  1  shows  that  system  states  consensus  is 
achieved,  figure  2  displays  the  instantaneous  cost  values. 
Figure  3  illustrates  the  optimal  cooperative  control  inputs, 
and  the  convergence  of  neural  network  weights  estimates  is 
shown  in  figure  4. 


Figure  2.  Instantaneous  cost  values  versus  time 


Figure  1.  Consensus  of  Xi{t) 


Figure  3.  Optimal  cooperative  controls 


5  Conclusions  Figure  4.  Parameters  of  neural  networks  versus  time 

In  this  paper,  we  proposed  a  new  approximately  adaptive 
online  multiagent  policy  iteration  algorithm  to  address  the 
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optimal  cooperative  consensus  control  problem  for  a  gen¬ 
eral  class  of  nonlinear  multiagent  systems.  The  proposed 
design  relies  on  iterative  policy  evaluation  and  policy  im¬ 
provement  by  using  neural  network  based  online  adaptive 
estimation  for  optimal  value  functions.  Simulation  results 
further  verified  the  effectiveness  of  the  proposed  design. 
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Experimental  Validation  of  Distributed  Cooperative  Control  for  Mobile 
Agents  with  Switching  Topologies  and  Time-Delays"*" 

Junzhen  Shao^  Jing  Wang^  and  Tianyu  Yang^ 


Abstract — In  this  paper,  we  present  practical  experimental 
results  to  demonstrate  a  control  law  for  consensus  of  multiagent 
systems  with  switching  topologies  and  time  delays.  The  nonlin¬ 
ear  control  law  utilizes  nonlinear  cooperative  control  gains  and 
uses  contraction  mapping  to  achieve  consensus  of  multiagent 
systems.  The  testing  platform  we  used  consists  of  a  number 
of  mobile  robots.  We  present  the  effectiveness  of  the  control 
law  design  by  Aria  mobile  robots  with  applications  in  dis¬ 
tributed  cooperative  formation  control.  Computer  simulations 
and  hardware  experiments  presented  include  point  consensus 
control  and  formation  control,  both  with  changing  topologies 
and  time-delays  using  directed  and  undirected  communication 
topologies. 


1.  Introduction 

Cooperative  control[5][10]  aims  at  achieving  consensus  or 
agreement  dynamics  in  a  multiagent  system.  It  is  an  area 
of  research  lying  at  the  intersection  of  systems  and  graph 
theory.  A  prominent  application  area  of  cooperative  control  is 
autonomous  systems,  especially  for  military  and  government 
applications.  The  development  of  single  agent  systems  is 
increasingly  mature  in  recent  years.  For  example,  unmanned 
aerial  vehicles  (UAV)  and  autonomous  underwater  vehicles 
(AUV)  play  an  important  role  in  applications  in  severe  envi¬ 
ronments  or  classified  operations.  On  the  other  hand  Cooper¬ 
ative  control  of  multiagent  systems  can  enhance  the  system 
performance  for  applications  like  patrolling,  monitoring,  etc. 
The  design  of  cooperative  control  is  closely  related  to  system 
dynamics.  For  linear  systems,  the  dynamics  can  be  simplified 
to  the  first-order  integrator  model  or  the  double  integrator 
model[l][2].  For  nonlinear  systems,  which  are  more  relevant 
to  real  world  applications,  cooperative  control  becomes  much 
more  complicated,  and  large  gaps  exist  between  theoretical 
system  design  and  practical  applications[l][3]. 

The  multigent  system  is  a  computerized  system  of  multiple 
interacting  intelligent  agents  within  an  environment[5][13], 
and  the  agents  work  together  to  accomplish  certain  tasks. 
Each  agent  in  the  system  has  the  capability  of  self-operating. 
There  are  two  key  topics  in  the  research  of  multiagent  sys¬ 
tems:  the  design  of  cooperative  control  laws,  and  the  control¬ 
lability  of  networks.  The  network  communication  topology 
plays  a  key  role  in  accomplishing  consensus  tasks.  From  this 
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perspective,  several  different  communication  strategies  have 
been  proposed[2][7].  A  popular  method  is  the  leader-follower 
model,  in  which  one  agent  plays  as  the  leader,  and  other 
agents  communicate  with  the  leader  when  performing  the 
tasks.  This  model  has  little  communication  requirement  and 
short  reaction  time.  Nevertheless,  the  entire  system  breaks 
down  once  the  leader  agent  is  disabled. 

In  this  paper,  we  experimentally  validate  the  effectiveness 
of  the  nonlinear  cooperative  control  proposed  in  [1],  which  is 
demonstrated  through  discontinuous  cooperative  control  for 
consensus  of  multiagent  systems  with  switching  topologies 
and  time-delays  using  mobile  robots[l][2][12][15].  By  de¬ 
signing  nonlinear  piecewise  control  gains,  the  consensus  or 
formation  of  multiangent  systems  with  switching  topologies 
and  time-delays  are  achieved  both  in  software  simulations 
and  hardware  experiments. 

II.  Problem  Formulation 

The  dynamics  of  a  group  of  mobile  agents  are  expressed 
as 


Xi  =  Vi  cos  9i ,  ji  =  V;  sin  9i ,  9i  =  cOi  ( 1 ) 

where  x,  and  y,  denote  the  position  of  the  ith  agent,  0,  shows 
the  orientation  which  is  based  on  the  driving  velocity  v,  and 
steering  velocity  w,-.  In  this  case  (x,',y,)  G  (v,-,®,)  G  M 
and  /  G  1 ,  •  •  •  ,  n. 

Let  us  define  the  desired  trajectory  for  the  group  of  agents 
as 


qoit)  =  G  (2) 

And  the  motion  frame  is  denoted  as  F(f),  which  can  be 
considered  as  a  constraint  in  geometric  coordinates  in  terms 
of  relative  positions  of  the  robots.  F(f)  consists  of  qo{t)  and 
the  orthonormal  vectors,  ei,e2,  as  defined  below. 


r  ^o(0  j 

ei(f)  = 

en{t) 

_e\2{t)_ 

= 

Mt) 

-V[-*o(0]^+[jo(fF. 

r  ^o(d  j 

e2{t)  = 

e2\{t) 

e22{t)_ 

= 

M‘) 

.Vl^o{t)]^+lyoit)P . 

(3) 

(4) 


Based  on  the  orthonormal  vectors  e,(f)  and  the  trajectory 
information  qo{t),  the  agent  position  is  given  as. 


Pi{t)  =  anei{t)  +  ai2e2{t),  (5) 
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where  P,(f)  is  the  position  of  the  ith  robot,  and  a,,  are 
constants  determining  the  formation  shape. 

The  sensing/communication  information  exchange 
among  the  fleet  agents  can  be  expressed  by  the 
sensing/communication  matrix, 


II 

1 

— 

an{tl)  ■■ 

azi 

1 - 

" 

<N 

1 

S 

^mm  \ 

where  at  (f^) :  A:  =  0, 1 ,  •  •  • ,  the  ith  agent  receives  velocity,  ori¬ 
entation  and  position  information  from  agent  j,  if  Sij{t^f.)  =  1. 
Otherwise,  if  =  0,  there  is  no  communication  between 
agent  i  and  agent  j. 

Define  the  sign{z)  function  as, 

r  1,  z>o 

sign{z)=<  0,  z  =  0  (7) 

[  -1,  z<0 

For  7^  0,  the  control  model  can  be  expressed  as. 


Ui{t)  =  ^  a{sij{tl),Pj{tl))sign{Pj{t)  - Pi{t)),  (8) 

j=i 

where  f  €  o:())  is  ^  nonlinear  control  gain. 

III.  Control  Design 

In  this  paper  we  use  formation  control  to  demonstrate  the 
consensus  of  multiagent  systems  with  the  new  control  law 
(8).  First,  we  use  the  robot  model  (1),  and  define  x  =  x  + 
Rcos9,y  =  y  +  Rsin9.  Therefore, 

X  =  vcos9  —  Rsin9co,y  =  vsin9  —Rcos9(0,  (9) 

Now,  we  can  linearize  the  robot  model  as  (10),  and  the  real 
control  inputs  are  expressed  as  (11). 

X=Ux,X=Uy,  (10) 


The  ith  robot’s  control  design  without  time-delays  is  based 
on  (2)(3)(4)(8)(12),  and  is  given  by, 

Ui  =  a{sijitl),Pj{tl))sign{Pj{t)  - Pi{t))  +  qi{t),  (13) 
j=i 

in  (13),  is  the  nonlinear  control  gain.  S{P^)  can 

be  changing  to  reflect  different  types  of  communication 
strategies  such  as  leader-follower,  global  communication  and 
neighbor-follower,  etc. 

The  design  of  nonlinear  control  gain  for  global  commu¬ 
nication  and  simple  directed  communication  without  time- 
delay  can  be  given  as. 


Li=i^u{tiy 


(14) 


which  has  been  proved  in  [1].  In  this  paper,  we  show 
through  computer  simulations  and  robots  experiments  that, 
the  control  gain  (14)  is  sufficient  for  systems  with  directed 
communication  topologies.  However,  this  control  gain  may 
fail  to  achieve  consensus  when  applied  to  systems  with 
undirected  communication  topologies.  Therefore,  we  adopt 
the  new  nonlinear  control  law  for  undirected  communication 
as  follows  [1].  Let  the  nonlinear  control  gain  be  designed 
as, 

case  7:  if  Pi{tl)  =  maxjf=N.Pj{t^)  =  minjf=N.Pj{tl),aijCan  be 

any  bounded  positive  value. 

case  2:  if  Pi{tl)  >  maxjeNiPjitl),  then  alpha{sij  can  be 
ranged. 


o< 

jm 


minjeN,Pj{tl) 

c 


(15) 


case  3\  if  F’,(f|)  <  minj^N^Pj{ty),  then  alpha{sij  can  be 
selected. 


n 


0  <  ^  a{sij{tl)  < 

jeNi 


maxj^N.Pj{tl) 

c 


(16) 


case  4:  if  min j^N.Pj{tl)  <  Pj{tl)  <  maxjeNjPjitl),  let 
alpha{sij  be  selected  to  satisfy. 


V 

cos  9 

sin  9 

WjC 

(0 

= 

sinO 

COS  6 

+ 

liy 

R 

R  . 

V 

UxCOsO  +  UysinO 

(0 

— 

UxSinO  1  UyCosO 

L  R  '  R  ^ 

The  moving  velocity  for  the  ith  robot  during  formation  to 
a  certain  shape  following  certain  predetermined  track  can  be 
expressed  as  (12)  based  on  (2), 

qi{t)  =  q^t)  +  ^  a{sij{tl),Pj{tl)),  (12) 

.1=1 

This  can  be  considered  as  the  derivative  value  based  on 
the  robot’s  relative  velocity.  For  each  robot  in  the  system, 
we  can  assume  the  velocity  of  the  robot  to  be  a  constant, 
and  others  follow  the  robot  based  on  (12). 


0<  E  a{sijitl)<mm[{9),ilQ)],  (17) 

jeNi 

where  c  could  be  any  positive  constant.  This  theorem  has 
been  proved  in  [1]  analytically. 

IV.  Simulation  and  Experiments 
A.  Software  Results 

In  the  Matlab  environment,  we  demonstrate  the  new  con¬ 
trol  algorithm  for  multiagent  systems.  All  the  demos  consist 
of  four  agents  performing  the  consensus  and  formation 
control  tasks  with  two  communication  strategies.  One  is  the 
directed  communication  topology,  which  means  each  agent 
knows  the  connected  neighbor  agents’  information  in  terms 
of  velocity,  position  and  orientation.  The  other  is  called  least 
restrictive  communication  topology,  in  which  each  agent 
knows  their  neighbors’  information  in  one  way  only. 
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To  implement  the  control  law  shown  in  (13),  for  the 
directed  communication  topology,  we  consider  a  group  of 
four  mobile  robots  achieving  a  rhombus  formation  while 
following  a  circular  movement.  For  the  general  setting 
as  expressed  in  (12)  (13),  we  set  q'o(f)  =  [Zcost ,2sint]^ , 
and  ei(f)  =  [—sint,cost]^ ,62(1)  =  [—cost,—sint]^.  The  four 
robots  are  following  a  circle,  and  the  control  gain  applied 
is  (14).  The  update  time  =  0.05i.  First,  the  control  input 
(13)  is  adopted,  which  contains  the  sign  function.  Second, 
to  simulate  the  control  law  without  the  sign  function,  simply 
remove  the  sign  function  part. 

In  the  first  set  of  simulations,  the  directed  communication 
topology  is  adopted  and  we  compare  the  system  performance 
with  and  without  the  sign  function.  Four  robots  are  designed 
to  either  converge  to  one  point  or  form  a  certain  shape  (while 
making  circular  movements).  Sij{ti^)  is  applied  as. 


T 

0 

0 

o' 

'1 

1 

0 

r 

0 

1 

0 

0 

,s2  = 

0 

1 
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0 
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0 

0 

0 

1 
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1 

0 

1 

1 

0 

0 

0 

1 

'1 
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'1 
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o' 
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1 

1 
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0 
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0 

1 

0 

0 

0 

1 

The  obtained  trajectories  are  shown  in  figures  1-4, 


Directed  Topology  for  Agents  Consensus  without  Sign  Function 


Fig.  1:  Consensus  of  four  agents  without  the  sign  function 
with  directed  communication  topology,  starting  from  four 
different  positions  and  converging  to  one  point.  Time  to 
consensus  is  T=4.7s 


Directed  Topology  for  Agents  Consensus  with  Sign  Function 


Fig.  2;  Consensus  of  four  agents  with  the  sign  function 
with  directed  communication  topology,  starting  from  four 
different  positions  and  converging  to  one  point.  Time  to 
consensus  is  T=3.6s 

Directed  Topology  Formation  Control  with  Sign  Function 


Fig.  3:  Four  robots  starting  from  different  positions  and 
forming  a  rhombus  while  following  a  circle  with  the  sign 
function  and  the  directed  communication  topology.  Time  to 
the  desired  formation  is  T=3.3s 


Directed  Topology  Formation  Control  without  Sign  Function 


Fig.  4:  Four  robots  starting  from  different  positions  forming 
a  rhombus  while  following  a  circle  without  the  sign  function 
and  with  the  directed  communication  topology.  Time  to  the 
desired  formation  is  t=4.4s 
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For  the  undirected  communication  topology,  we  adopt 
the  least-restrictive  neighbor  communication  topology, 
with  each  robot  only  communicating  with  a  neighboring 
agent.  There  is  no  leader  in  the  system,  and  minimum  data 
transferring  is  incurred.  i,;(4)  is  set  as. 
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The  control  law  with  time-delays  and  the  sign  function 
can  be  expressed  as  (21),  with  set  to  one  second, 

n 


Undirect  Topology  for  Agents  Consensus 


-20  -10  0  10  20 
X 


Fig.  6:  Consensus  of  four  agents  using  new  rules  with  the 
sign  function  and  time-delays  with  the  directed  communi¬ 
cation  topology.  They  start  from  four  different  positions  to 
converge  to  one  point  in  3.2s. 


Ui  =  £  a{sij{t[),Pj{tl)  -  Xij)sign{Pj(t  -  Tij)-Pi{t))+qi(t), 
l=\ 

(21) 

The  control  gain  design  is  specified  in  (14)(15)(16)(17), 
i.e.,  every  time  an  agent  receives  a  neighbor’s  position  infor¬ 
mation,  the  software  compares  this  value  with  the  maximum 
or  minimum  value  as  shown  in  (14)(15)(16)(17),  then  the 
control  gain  is  selected  accordingly. 

As  we  mentioned  before,  the  control  gain  in  (14)  may  not 
be  a  appropriate  for  undirected  communication  topologies. 
Agents  may  fail  to  form  the  desired  shape.  Figure  5  below 
gives  an  example  of  such  scenario  with  the  continuous 
control  gain  (14).  The  four  agents  fail  to  converge  within 
60s  and  the  trajectory  contains  oscillation. 


Undirected  Topology  for  Agents  Consensus  with  Sign  Function 


Undirected  Topology  Formation  Control 


Fig.  5;  Consensus  of  four  agents  with  the  sign  function  and 
time-delays  with  the  directed  communication  topology.  They 
start  from  four  different  positions  and  attempt  to  converge 
to  one  point,  but  fail  to  converge  within  60s. 

As  we  apply  the  rules  (14)(15)(16)(17),  the  system  works 
satisfactorily  under  the  same  conditions,  as  shown  in  figure 
6.  A  formation  example  is  also  displayed  in  figure  7, 


Fig.  7:  Four  robots  starting  from  random  positions  form  a 
rectangle  following  a  certain  circular  movement.  Time  to 
the  desired  formation  t=3.5s 

B.  Hardware  Experiments 

We  also  implemented  the  new  control  law  in  the  Aira 
mobile  robots.  The  robots  know  the  initial  position  of  them¬ 
selves,  but  when  they  are  moving,  each  robot  only  knows 
the  velocity  and  the  position  information  from  one  of  its 
neighboring  robots.  qi{t)  is  set  based  on  the  requirement 
of  the  consensus  speed.  All  the  Aira  demos  shown  are  for 
the  distributed  formation  control  of  multiagent  systems  with 
switching  topologies.  qi{t)  is  set  to  2QQmm  in  both  x  and 
y  orientations.  The  communication  topology  (19)  is  adopted 
with  the  control  law  (13).  At  this  time,  the  control  gains  are 
designed  as  Kx=120  and  Ky=20.  Figure  8  shows  the  position 
information  of  robot  1  and  figure  9  compares  the  experimental 
results  with  the  theory. 

From  figure  9  we  can  see  the  experimental  value  Kx=120 
fits  the  theoretical  control  gain  very  well  based  on  (15). 
Similarly,  for  the  y  domain,  the  experimental  value  fits  well 
with  the  theoretical  value  (17). 
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Position  Shift  of  Robotl  in  X  and  Y  domain 


Fig.  8;  Referring  to  Robotl  and  Robot2,  Y  is  the  distance 
between  two  robots  in  x  and  y  domains.  We  sampled  120 
points,  one  point  per  second. 


Fig.  9:  X  and  Y  control  gains  comparison  between  theory 
and  experiments.  We  sampled  120  points,  one  point  per 
second. 

Figures  10  and  11  are  a  series  of  images  showing  the 
Aira  robot  simulations  in  our  lab. 


Square  to  Line 


Fig.  10;  Rectangle-to-line  and  line-to-rectangle  formation 
control  with  undirected  communication  and  the  sign  func¬ 
tion. 


Figure  11  shows  the  formation  shape  changing  from 
rectangle  to  line,  then  to  rhombus,  and  finally  converging 
to  one  point.  K  is  set  to  60,  moving  velocity  qi{t)  =  100mm 
with  the  same  communication  topology  (19). 


Format  from  Rectangle  to  tine  to  Rhombus  End  in  a  Point 


Fig.  11:  Formation  changes  from  rectangle  to  line  to  rhom¬ 
bus  and  ends  with  a  point  with  undirected  communication 
and  the  sign  function 


V.  Conclusion 

In  this  paper,  we  demonstrate  the  distributed  coopera¬ 
tive  control  for  the  consensus  of  multigent  systems  with 
switching  topologies  and  time-delays  using  Matlab  and  Aira 
mobile  robot  experiments.  Matlab  simulations  confirmed  the 
advantages  of  using  the  sign  function  in  terms  of  short 
consensus  time  and  assurance  of  consensus,  compared  with 
the  control  law  without  the  sign  function.  Also,  the  results 
obtained  through  the  discontinuous  design  of  control  gains 
validated  the  effectiveness  of  such  design  in  consensus  and 
formation  problems.  Another  important  beneht  is,  this  new 
control  law  can  be  easily  loaded  into  real  robots,  so  practical 
implementation  issues  can  be  studied  experimentally,  such  as 
time-delays,  hardware  limitations,  etc. 
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Formation  Control  of  Multiple  Nonholonomic  Mobile  Robots  with 
Limited  Information  of  a  Desired  Trajectory 


Jing  Wang,  Morrison  Obeng,  Tianyu  Yang,  Gennady  Staskevich,  and  Brian  Abbe 


Abstract — In  the  study  of  task  coordination  for  multiagent 
systems,  formation  control  has  received  considerable  attention 
due  to  its  potential  applications  in  civil  and/or  military  prac¬ 
tices.  Fundamentally,  formation  control  problem  for  multiagent 
systems  can  be  formulated  as  making  a  group  of  agents  follow 
the  desired  trajectory  while  maintaining  certain  prescribed 
geometric  distances  among  agents.  In  this  paper,  we  consider 
the  formation  control  problem  for  mobile  robots  with  nonlinear 
dynamics  and  moving  in  a  2D  environment.  To  address  the 
inherent  challenges  due  to  nonlinear  system  dynamics  and 
agents’  limited  sensing/communication  capabilities,  we  instill  an 
idea  of  integrating  the  recently  developed  distributed  consensus 
theory  into  the  standard  feedback  control,  and  propose  a  new 
time-varying  cooperative  control  strategy  to  solve  the  forma¬ 
tion  control  problem  for  multiagent  systems.  In  particular, 
the  proposed  design  only  requires  the  local  and  intermittent 
information  exchange  among  agents  to  achieve  the  formation 
control  objective.  More  importantly,  we  remove  the  restriction 
on  the  need  of  the  desired  trajectory  for  every  agent,  and 
instead  design  a  distributed  observer  for  obtaining  the  desired 
trajectory  in  order  to  establish  the  formation  in  the  design. 
The  overall  distributed  formation  control  system  stability  is 
rigorously  proved  by  using  a  contraction  mapping  method 
under  the  condition  that  the  sensing/communication  network 
among  robots  is  sequentially  complete.  Simulation  is  provided 
to  validate  the  effectiveness  of  the  proposed  design. 

1.  Introduction 

Coordination  of  multiagent  systems  has  been  and  con¬ 
tinues  to  be  an  active  research  area  in  the  current  control 
community,  and  recent  years  have  seen  a  significant  progress 
in  the  development  of  distributed  consensus  strategies  for 
multiagent  systems  [24]  [23]  [25]. 

The  developments  have  been  primarily  focused  on  ad¬ 
dressing  two  aspects  of  issues.  The  first  issue  is  on  the  study 
of  network  controllability.  The  core  is  to  identify  the  least 
required  sensing/communication  conditions  for  completing 
coordination  tasks.  A  significant  result  was  obtained  in 
[10],  in  which  the  sensing/communication  topologies  are 
modeled  using  an  undirected  graph  and  its  uniform  connec¬ 
tivity  provides  the  sufficient  condition  for  agents  consensus. 
This  condition  was  further  relaxed  to  take  into  account  the 
directed  graph  [25]  [27],  and  the  existence  of  a  spanning  tree 
in  the  graph  is  necessary  and  sufficient  for  group  coordina¬ 
tion.  In  our  recent  work  [24],  we  addressed  this  issue  by 
using  matrix  theory,  and  introduced  the  notion  of  sequential 
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completeness  of  sensing/communication  matrix  sequences 
to  describe  the  connectivity  condition.  The  second  issue  is 
on  the  design  of  cooperative  control  and  stability  analysis. 
The  cooperative  control  for  linear  systems  are  thoroughly 
studied  in  [13][10][27][8][24].  For  nonlinear  systems,  some 
results  are  available  by  using  passivity-based  design  in  [2], 
Lyapunov  design  in  [23],  and  set-valued  Lyapunov  functions 

[19] [16]. 

The  aforementioned  results  are  mostly  centered  on  the 
study  of  consensus  problem  for  multiple  dynamical  systems. 
One  of  direct  applications  of  those  results  is  to  tackle  the 
formation  control  problem  for  real  systems.  Fundamentally, 
formation  control  problem  can  be  formulated  as  making  a 
group  of  agents  follow  the  desired  trajectory  while  maintain¬ 
ing  certain  prescribed  geometric  distances  among  agents.  In 
this  paper,  we  consider  the  formation  control  problem  for 
nonholonomic  mobile  robots  with  nonlinear  dynamics  and 
moving  in  a  2D  environment.  Progress  has  been  made  in 
solving  formation  control  problem  by  using  leader-follow 
strategies,  virtual  structure  method,  and  artificial  potentials 
[26][11][22][7][5][17][9][1][14][18].  However,  most  results 
are  obtained  by  either  assuming  linear  system  dynamics  or 
converting  the  nonholonomic  robot  model  into  linear  one 
through  feedback  linearization.  It  is  well  known  that  it  is 
challenging  to  design  feedback  control  for  mobile  robots 
with  nonholonomic  constraints  [21].  There  are  some  results 
for  studying  the  cooperative  control  of  nonholonomic  robots 
[6][15][20][28].  Among  them,  a  discontinuous  control  was 
proposed  in  [6]  and  nonsmooth  Lyapunov  theory  and  graph 
theory  are  used  for  stability  analysis.  In  [15],  based  on  the 
Frenet-Serret  model  of  unicycle,  time-varying  controls  were 
designed  and  analyzed  using  average  theory.  The  work  in 

[20]  [28]  assumed  the  constant  driving  velocity  and  controls 
were  only  designed  for  steering  velocity.  In  our  recent  paper 
[30],  we  proposed  a  distributed  cooperative  steering  control 
design  for  a  class  of  networked  dynamical  systems  with 
inherent  nonlinear  dynamics.  A  number  of  conditions  were 
established  in  terms  of  the  properties  of  the  cooperative 
steering  control  for  achieving  cooperative  behaviors. 

In  this  paper,  we  present  a  new  solution  for  the  formation 
control  design  of  nonholonomic  mobile  robots.  The  nonholo¬ 
nomic  constraints  of  robots  are  explicitly  taken  into  account 
in  the  proposed  design  by  converting  the  unicycle  model 
into  the  canonical  chained  form.  The  proposed  distributed 
formation  control  utilizes  the  information  exchange  to  better 
coordinate  the  motion  of  individual  robots,  but  there  is 
no  requirement  for  the  strongly  connected  communication 
topology  which  could  be  uncertain  and  unreliable  due  to 


978-1-4799-4774-4/14/$31.00  ©2014  IEEE 
Approved  for  Public  Release;  Distribution  Unlimited. 
77 


551 


communication  noises.  Instead,  we  allow  more  flexible, 
intermittent,  and  time  varying  communication  topologies 
among  robots.  A  finite-time  distributed  observer  is  designed 
to  estimate  the  desired  trajectory  information.  The  stability 
and  convergence  analysis  for  the  proposed  formation  control 
are  done  through  contraction  mapping  method  under  the 
condition  that  the  sensing/communication  network  among 
robots  is  sequentially  complete.  Simulation  result  is  included 
to  validate  the  effectiveness  of  the  proposed  control. 

The  rest  of  the  paper  is  organized  as  follows.  Section 
II  formulates  the  formation  control  problem.  Main  results 
are  presented  in  section  III,  in  which  distributed  finite¬ 
time  observer  and  nonlinear  formation  control  are  designed 
including  system  stability  analysis.  A  simulation  example  is 
given  in  section  IV.  Section  V  concludes  the  paper. 

II.  Problem  Formulation 
Consider  a  network  of  multiple  nonholonomic  mobile 
robots  with  the  individual  system  dynamics  given  by 

Xi  =  Vi  cos  6 i, 

i/i  =  ViSm0i,  (1) 

9i  =  uj^ 

where  i  €  fi  =  {I,-’’  &  3?^  denotes  the  *th 

robot’s  position,  9i  is  the  orientation,  Vi  G  ^  driving  velocity, 
and  Wi  S  3?  the  steering  velocity. 

To  make  the  method  in  this  paper  more  general,  our  design 
will  be  based  on  the  following  canonical  chained  form 

Zil  =  Uii,  Zi2  =  Ui2,  Zi3  =  Zi2Uii,  (2) 


into  which  the  model  in  (1)  can  be  converted  by  using  the 
following  state  and  input  transformations 


Zji  =  Xi,  Zi2  =  tan6»i,  Zis  =  yi,  (3) 

UJi 

Uii  =  Vi  cos  9i,  Ui2  =  - (4) 

cos"^  9i 

That  is,  once  the  formation  controls  un  and  Ui2  are  designed, 
the  corresponding  Vi  and  can  be  found  through  the  inverse 
transformation  of  (4). 

The  design  objective  of  this  paper  is  to  coordinate  the 
motion  of  individual  robots  to  follow  a  desired  trajectory 
contour  while  maintaining  certain  prescribed  geometric  for¬ 
mation  shape  through  local  information  exchange  among 
robots.  By  taking  the  whole  group  of  mobile  robots  as  a 
virtual  body  moving  along  the  desired  trajectory,  formation 
shape  of  robots  in  the  group  can  be  determined  by  a  set  of  lo¬ 
cal  coordinates  with  reference  to  the  moving  frame  attached 
to  the  desired  trajectory  (see  figure  1  for  an  illustration  of 
moving  frames). 

More  specifically,  let  qo{t)  =  [xo{t),yQ{t)]'^  G  3?^  be  the 
desired  trajectory  for  the  group  motion,  the  moving  frame 
J^{t)  attached  to  qo{t)  can  be  defined  by  the  following 
orthonormal  vectors  ei{t)  and  62 (f) 


ei(f)  = 


eii(f) 

ei2(f) 


_ iojt) _  ■ 

V[io{tW  +  [yo{tW 

yojt) 

V[io{tW  +  MtW  - 


Fig.  1.  Moving  frames  on  the  desired  trajectory 
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"yWF+MF 

Xojt) 
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Accordingly,  any  formation  consisting  of  n  robot  positions 
in  J^{t)  can  be  expressed  as  {Pi,  •  •  •  ,  P„}  with 

Pi{t)  =  audit)  -f  Q;i2e2(f),  (5) 


62 (f)  = 


621 (f) 
622(0 


where  Uij  are  constants  of  determining  the  formation.  To 
this  end,  the  formation  control  objective  can  be  recast  as  to 
design  the  control  laws  Vi{t)  and  uji{t)  for  the  ith  robot  such 


that 


lim 

t—^OO 


Xi{t) 

y^it) 


-  qo{t)  -  Pi{t) 


=  0. 


(6) 


It  is  noted  that  the  control  objective  defined  in  (6)  can 
be  achieved  through  the  standard  tracking  control  design 
for  individual  robots  if  the  desired  trajectory  qo{t)  and  its 
derivative  qo{t)  are  available  to  every  robot.  However,  such 
a  design  may  not  be  robust  in  the  presence  of  disturbance 
and  noise  measurements  due  to  the  lack  of  coordination 
among  robots.  On  the  other  hand,  the  desired  trajectory 
qo{t)  may  be  known  only  by  some  of  robots  in  the  group. 
Therefore,  it  is  desirable  to  design  distributed  formation 
control  law  for  the  jth  robot  based  on  information  exchange 
and  relative  position  measurement  between  robots  within  its 
sensing/communication  range. 

In  this  paper,  we  assume  that  the  sensing/communication 
topologies  among  robots  are  changing,  which  can  be  cap¬ 
tured  by  the  time  sequence  jf®  :  ry  =  0, 1,  •  •  •  }.  Correspond¬ 
ingly,  we  introduce  the  following  binary  matrix  to  describe 
the  sensing/communication  topology. 


S{t)  = 


Sll  Si2{t) 

521  (f)  S22 


Slq{t) 

S2qit) 


(7) 


^nl  {t)  Sn2  {t)  *  *  *  ^nn 


with  S(t)  =  S(t^),Vt  G  [f®,f®+i),  where  su  =  1;  Sij(t)  = 
1  if  the  jth  robot  is  in  the  sensing/communication  range 
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of  the  zth  robot  at  time  t,  and  Sij  =  0  if  otherwise;  and 
fg  =  to.  It  can  be  assumed  without  loss  of  any  generality 
that  0  <  C(  <  <  Ct  <  oo,  where  Cj  and  c*  are 

constant  bounds.  We  also  define  the  neighbor  set  for  the  ith 
robot  as  =  {j  G  n\sij(t)  =  1}. 

It  is  apparent  that  in  order  to  achieve  the  coordination 
task,  the  sensing/communication  topologies  defined  by  (7) 
must  satisfy  certain  connectivity  conditions.  In  our  recent 
work  [24]  [30],  we  introduced  the  notion  of  sequentially 
completeness  to  describe  the  least  required  condition  on 
network  connectivity  for  cooperative  control  design,  which 
is  restated  by  the  following  definitions. 

Definition  2.1:  Sensing/communication  matrix  sequence 
{S'(f)}  is  said  to  be  sequentially  lower-triangularly  complete 
if  it  is  sequentially  lower-triangular  and  in  every  row  i  of  its 
lower  triangular  canonical  form,  there  is  at  lease  one  j  <  i 
such  that  the  corresponding  block  Sij{t)  is  uniformly  non¬ 
vanishing. 

Definition  2.2:  Sensing/communication  matrix  sequence 
{S'(f)}  is  said  to  be  sequentially  complete  if  the  sequence 
contains  an  infinite  subsequence  that  is  sequentially  lower- 
triangularly  complete. 

As  an  example  for  sequential  completeness,  let  us  assume 
that  the  sensing/communication  topologies  for  3  robots  are 
changing  according  to  the  sequence  {S{tk),k  G  = 

{1,  2, . . .  }}  defined  below:  S{tk)  =  Si  for  k  =  dp,  S{tk)  = 
S2  for  k  =  4p  -f  1,  S{tk)  =  S3  for  k  =  dp  -f  2,  and 
S{tk)  =  S4  for  /c  =  dp  -f  3,  where  p  €  H, 


■  1 

0 

0  ■ 

1 

1 

0  ■ 

1  — 

1 

1 

0 

II 

0 

1 

0 

0 

0 

1 

0 

0 

1 

■  1 

0 

0  ■ 

■  1 

0 

0  ■ 

0 

1 

0 

,  and  S'4  = 

0 

1 

0 

1 

0 

1 

0 

0 

1 

Us.= 


of  S^ 

=  1, 

■  1 

1 

0  ■ 

1 

1 

0 

"  1  ■ 

'0“ 

1 " 

,  d  is 


A 

0  ■ 

.  ‘^A,21 

1 

It  then  follows  from  the  structure  of  [J^  Si  that  the  corre¬ 
sponding  sequence  is  lower-triangularly  complete,  and  there¬ 
fore  the  switching  sensing/communication  topologies  defined 
by  (8)  is  sequentially  complete. 

Assumption  2.1:  The  group  of  robots  defined  in  (1)  has  a 
sequentially  complete  sensing/communication  network. 


III.  The  Main  Result 

In  this  section,  we  present  a  nonlinear  formation  control 
design  for  nonholonomic  mobile  robots  (1)  with  limited 
information  of  the  desired  trajectory  qo{t).  Particularly,  the 
proposed  new  formation  control  will  be  done  with  the  aid  of 
distributed  observers  for  the  estimation  of  qo{t). 

It  follows  that  the  desired  trajectory  qo{t)  also  satisfies  the 
nonholonomic  constraints,  that  is,  we  have 

To  =  Vo  cos  6*0,  yo  =  vq  sin 6»o,  ^0  =  wq 


for  some  Uo,Wo,  and  9q.  It  is  then  readily  seen  that  the 
moving  frame  attached  to  qo  (t)  can  be  established  using  the 
rotation  matrix  in  terms  of  the  desired  trajectory  orientation 
Oo{t),  that  is. 


ei(f) 


cos  00 
sin  00 


,e2{t) 


—  sin  00 
cos  00 


To  this  end,  the  formation  control  can  be  designed  based 
on  the  real  time  estimation  of  xo{t),  yo{t)  0o{t).  The  pro¬ 
posed  distributed  observer  is  of  the  form  (for  t  G 


Xt.oit) 

=  o:ijSg^ixj,oitl)  -  Xi^oitD) 

j&Mi 

-faioSioSgn(a;o(ffe)  -  Xifi{tl)) 

(9) 

m.oit) 

=  -  y^,o{tk)) 

+a^oSiosgn{yoitl)  -  yi,o{tl)) 

(10) 

k.o{t) 

=  ^v^Sn{0j,o{tl)  -  0^,oitl)) 

jeMi 

+aMSiosgn{0o{tl)  -  0z,o{tk)) 

(11) 

where  Xi^o{t),  yifi(t)  and  0ifi{t)  are  the  zth  robot’s  estimate 
of  Xo{t),  yoit),  and  6*0 (O’  respectively,  s^o  =  1  if  and  only 
if  the  zth  robot  has  the  direct  access  to  the  information  of 
the  desired  trajectory,  aij  and  aio  are  piecewise  constant 
control  gains  to  be  designed,  and  sgn(  )  function  is  defined 
as 

(  1,  z  >  0 

sgn(z)  =  <^  0,  z  =  0 

[  -1,  z<0 

The  following  theorem  states  the  finite-time  convergence 
of  the  proposed  distributed  observers  (9),  (10)  and  (11)  under 
appropriate  choices  of  aij  and  aio. 

Theorem  1:  Consider  a  group  of  nonholonomic  mobile 
robots  given  by  (2)  with  assumption  2.1.  The  finite-time 
convergence  of  Xifi{t)  to  xo{t),  yi,o{t)  to  yo{t)  and  0ifi{t) 
to  0o{t)  can  be  guaranteed  under  the  proposed  distributed 
observers  (9),  (10)  and  (11),  if  the  control  gain  and  aio 
(for  Sio  =  1)  are  designed  as  follows:  for  agent  z, 

^  O'Zo(ffc)  ^  if  5zo  =  Ij  (12) 

jeMi 

aij{tl)  >  d,  if  Sio  =  0  (13) 

j&Mi 


where  d  is  the  upper  bound  of  |i;o(f)|)  1 2/0 (f) I  and  |0o(f)|- 
Proof:  We  prove  the  convergence  of  (9).  The  same  pro¬ 
cedure  applies  to  (10)  and  (11).  We  first  consider  the  robots 
which  have  the  direct  access  to  xo{t),  that  is,  robot  z  for  z  G 
flo  =  {*  G  :  s»o  =  !}■  Define  x^^^{t)  =  maxigHo  Xi^oit) 
and  =  max,gno  x,,oit).  Let  =  x^^^it)  - 

xo{t),  and  “  xoit).  It  follows  from  (9) 
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that 

+a^osgnixoitl)  -  -  xq  (14) 

=  -  iro"(^fc)) 

jeMi 

+a,oSgn(xo(tfc)  -  “  ±0  (15) 

Now  consider  the  evolution  of  x^^^{t)  and  a;™”(f).  If  both 
xY^Q^{t)  >  0  and  x™o*”(^)  —  0,  it  suffices  to  show  that 
x'^Q^it)  converges  to  zero  in  finite  time.  It  follows  from 
(14)  and  (12)  that 

=  -  X!  “  *^*0  -  io 

jeJ^i 

<  -  aij  -  aio  +  d  <  0,  (16) 

which  implies  that  x™o“®  will  converge  to  zero  in  a  finite 
time.  If  both  x™g“^(f)  <  0  and  x™”(f)  <  0,  it  suffices  to 
show  that  a;™"(f)  converges  to  zero  in  finite  time.  It  follows 
from  (15)  and  (12)  that 

iro”(^)  =  ajj  +  ctio  -  xq 

>  E  cXij  +  ctio  —  d  >  0,  (17) 

jeM 

which  implies  that  x™"  converge  to  zero  in  a  finite 
time.  If  x™g“^(f)  >  0  and  x™”(f)  <  0,  same  conclusion  can 
be  drawn  by  noting  both  (16)  and  (17). 

Now  let  us  consider  the  robots  which  do  not  have  direct 
access  to  xq  (f)  but  have  information  exchange  with  robots  in 
fig,  that  is,  for  robot  i  with  i  G  fli  =  {i  e  12  :  s^o  =  0,  Sij  = 
l)j  €  12o}-  To  this  end,  we  know  that  since  after  certain 
finite  time,  robot  j  (for  j  G  12g)  will  converge  to  xo{t),  thus, 
similar  analysis  can  be  done  to  show  the  convergence  of 
Xi^o  (for  *  €  ffi)  to  zero  in  finite  time.  The  above  procedure 
can  be  repeated  recursively  due  to  the  sequentially  complete 
sensing/communication  assumption.  □ 

Remark  3.1:  The  proposed  distributed  observers  are  mo¬ 
tivated  by  the  result  in  [4],  in  which  a  distributed  estimator 
with  terms  like  sgn(^”^j^  aij{xj  —  xi))  was  proposed,  while 
in  this  paper  we  consider  more  general  form  with  terms  like 
sgn(xj  -  Xi).  O 

In  what  follows,  we  present  the  distributed  formation 
control  design.  Let  us  define  an  infinite  sequence  of  time 
instants  {to  +  kT^}  for  k  G  M  =  {0, 1,  •  •  • }  and  with 
sampling  time  0  <  Tg  <  Cf.  The  control  inputs  will 
be  updated  according  to  the  sampling  time  instants.  For 
notational  convenience,  z{to  +  kTg)  is  simplified  as  z{k) 
for  any  variable  z. 

Theorem  2:  Consider  a  group  of  nonholonomic  mobile 
robots  given  by  (2)  with  assumption  2.1.  Let  the  distributed 
cooperative  control  be  for  f  G  +  fcTs,  to  +  (fc  +  1)7),) 

Uii{t)  =  afi  +  aJij  sina;(f  -  fo  -  fcTs)  (18) 

Ui2(t)  =  cos  w(f  -  to  -  kTs)  (19) 


where  w  =  ^,  0*2  7^  0  can  be  any  constant,  and 


hk 


b% 


7^'^G^j{k)[xj{k)  -  Xi{k)  -  Xj{k) 

+xf{k  +  1)], 

1  " 

^G^j{k)[zj2{k)  -  Zi2ik)], 


Tg  .  ^ 
j=i 

2a; 


(20) 

(21) 


E  -  Vji^) 

i=i 


+yfik  + 1)]  - 

—ciiiZi2{k)Tg  + 


with 

G.,(fc)  = 


Sy(fc) 


E" 


J  =  l,--  -  ,n. 


(22) 


(23) 


xf{k)  =  Xifi{k)  +  an  cos  9i^o{k)  -  C(i2sin9tfi{k) 
Viik)  =  yi,o{k)  +  ansm9i^oik)  +  ai2cos9i^oik) 


xf(fc  +  l)  =  Xtfi{k  +  1)  +  aiicos9i^o{k  +  1) 
-ai2  sin6li_g(A:  -f  1) 

yf(k  +  l)  =  yi^o{k  +  l)  +  ansm9i^oik  +  l) 
-\-ai2  cos  9^  o(k  +  1) 


where  Xip,  yiy,  and  9ifi  are  governed  by  (9),  (10),  and  (11), 
respectively,  and 


a;i,o(^  +  l)  =  Xi^oik) 

+Tg{  aijSgn{xj^oik)  -  Xi,g(fc)) 

+aioSioSgn(xg(A:)  -  Xi,g(A:)))  (24) 

yi,o{k  +  l)  =  yi,o{k) 

+Tg{  a„sgn(2/j,o(fc)  -  t/i,o(^)) 

+aioSioSgn{yo{k)  -  y^,o{k)))  (25) 

9i,o{k  +  1)  =  9ifi{k) 

+Ts{  y^  aijSgn{9jfi{k)  -  6'i,o(fc)) 

+aioSiosgn{9o{k)  -  9^fi{k)))  (26) 

Then  the  formation  control  objective  (6)  is  achieved. 

Proof:  Directly  applying  controls  (18)  and  (19)  to  (2) 
yields 


Zii{k  +  1) 

z^2{k  +  1) 


Zisik  +  1) 


Zii{k)  +  diiTg 
Zi2{k)  +  b'l^Tg 

Zisik)  +  anZi2{k)Tg  + 
uj  2uj 


(27) 

(28) 


(29) 
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Substituting  (20),  (21)  and  (22)  into  the  above  equations,  and 
noting  that  and  can  be  rewritten  as 


^i2  — 


2a; 


+yfm  +  {yf{k  +  l)-yf{k))- 
-a'l^za{k)T,  + 


LjJ 


To  show  the  convergence  of  C,ii{k),  let  us  define  12  = 
{!,•  •  •  ,n}  to  be  the  set  of  indices  on  state  variables,  and 
at  time  instant  fg  +  kTg,  let  Cmax(^)  =  maxjCji(A:)  and 


f(k)+xf(k)] 

Cmin(*)  =  xamjCji(k).  Define  sub-sets  f2max(fc),  Xl^inik), 
Aimax(*)  =  ^2/nmax(fc),  and  =  n/f2min(fc)  aS 

follows: 

(30) 

f2„,ax(fc)  =  {i*  e  12  :  Ci*l(fc)  =  Cmax(fc)} 

-yj(k) 

and 

12min(fc)  =  {L  e  12  :  Ci.l(fc)  =  Cmin(fc)} 

Thus,  the  proof  of  the  convergence  Cii(k)  is  equivalent  to 

(31) 


prove  for  any  k,  there  exists  a  constant  6{k)  >  0,  such  that 

lldax(^  +  ^)-din(^  +  ^)ll  <  A||CLxW-di„(fc)ll,  (38) 


we  have 


Zii(k  +  1)  —  Zii{k)  +  Gij(k)[Qi{k)  —  Cii(fc)] 

i=i 

/*^0  +  (^  +  l)^s 


'  tQ  kT s 


i  it)dt, 


(32) 


Zi2{k  +  1)  =  Zi2{k)  +  ^  G,j{k)[zj2{k)  -  Zi2{k)]i33) 

i=i 

n 

Zi3{k  +  1)  =  Zis{k)  +  ^ij{k)[Cj3{k)  — 

p't'Q-\-{k-\-l)T's 

+  /  yf{t)dt,  (34) 

tQ-\-hTg 


A 


A 


where  Oi(fc)  =  xi{k)  -  xf{k)  and  Ci3(fc)  =  yiik)  -  yf{k). 
It  then  follows  from  the  definitions  of  Cii{k)  and  Qsik)  and 


for  some  0  <  A  <  1. 

Noting  that  It  follows  from  the  sampling  time  Tg  <  Cj 
that  the  sensing/communication  sequence  {S{k)  =  S{to  + 
kTs),k  =  0, 1,---}  completely  captures  the  information 
of  sequence  {S{t^),ri  =  0, 1,---}.  Thus,  the  sequential 
completeness  of  {S'(f^),  77  =  0, 1,  •  •  •  }  implies  the  sequential 
completeness  of  {S{k),  fc  =  0, 1,  •  •  •  }. 

To  this  end,  the  inequality  (38)  can  be  established  by 
looking  into  the  evolution  of  Ci*i(fc)  for  every  i*  e  12max(fc) 
and  Ci,i{k)  for  every  2*  G  ilniin(k).  This  can  be  done  using 
a  similar  analysis  as  in  [29].  Details  omitted  due  to  space 
limitation.  □ 

Remark  3.2:  The  kinematic  model  (1)  was  considered  in 
the  proposed  formation  control.  With  the  aid  of  backstep- 
ping  design  [12],  the  result  can  be  extended  to  deal  with 
the  distributed  formation  control  for  nonholonomic  robots 
described  by  the  following  dynamic  models  [3]: 


from  (32)  and  (34)  that 

n 

C^l{k  +  l)=  C*1  (k)  +  51 G,, (k) [Qi{k)  -  C*1  (fc)] ,(35) 
i=i 

n 

Crsik  +  1)=  C*3(fc)  +  Y.G,j{k)[Q3{k)  -  C*3(fc)].(36) 

i=i 

To  this  end,  if  we  can  show  that  limfc_>oo  Cii(^)  =  ci, 
limfc^oo  Ci3(fc)  =  C3,  limfc^oo  2:i2(^)  =  C2,  for  all  i, 
where  ci,  02,03  are  some  constants,  then  it  is  obvious 
that  the  formation  control  objective  (6)  is  achieved  since 
Zii{k)  =  01+  xf{k)  and  limfe^oo+3(fc)  =  C3  + 

yf{k). 

It  suffices  to  show  the  convergence  of  Cii(fc)  in  (35).  Same 
argument  can  be  applied  to  Ctaik)  in  (36)  and  Zi2{k)  in  (33). 
It  follows  from  the  choice  of  Gij{k)  in  (23)  that 

n  n 

-  Gi(fc)]  =  -  Qi{k). 

j=i  j=i 

Thus,  (35)  becomes 

n 

C,i(A:  +  l)  =  5]G,,(fc)0i(fc)  (37) 

i=i 


M{q)q  +  Ciq,q)q  +  Giq)  =  B{q)T  +  j'^ {q)X  (39) 
J{q)q  =  0  (40) 

where  <?  =  [  <Zi  •  •  •  ^  €  3?”  is  the  generalized 

coordinates,  M{q)  G  3?”^"  is  a  bounded  positive-definite 
symmetric  inertia  matrix,  G(q,q)  G  3?"^"  is  the  centripetal 
and  Coriolis  matrix,  G(q)  G  3?"  is  the  gravitation  force 
vector,  B(q)  G  3?"^'’  is  the  input  transformation  matrix, 
T  €  3?'’  is  the  input  vector  of  forces  and  torques,  J(q)  G 

^(n-m)xu 

is  the  matrix  associated  with  the  constraints,  and 
A  G  3?”“’”  is  the  vector  of  constraint  forces  on  the  contact 
point  between  the  rigid  body  and  the  surface.  o 

IV.  Simulation 

In  this  section,  we  simulate  the  proposed  formation  control 
by  considering  three  mobile  robots  moving  according  to  a 
circular  contour  while  maintaining  a  right  triangle  formation. 
We  assume  that  the  sensor/communication  topologies  are 
changing  according  to  the  sequence  {S(tk),k  G  H}  defined 
in  (8),  and  robot  1  can  receive  the  desired  trajectory  contour 
information  qo(t). 

Let  qo(t)  be  [sin(0.2f),  —  cos(0.2f)]^.  The  corresponding 
moving  frame  is  given  by  ei(f)  =  [cos(0.2f),  —  sin(0.2f)]^, 
62(2)  =  [sin(0.2f), cos(0.2f)]^.  The  formation  parameters 
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are  given  by  an  =  0,  a^  =  0,  021  =  -l,a22  =  1,031  = 
-1,032  =  -1.  The  initial  conditions  [xi{to),yiito),Oi{to)] 
are  given  by  [0.1, 0.2, 7r/4],  [1,  — 2, 7r/6],  [— 1,  — 1.5, for 

1  =  1,2,3,  respectively,  =  0.2  and  Tg  =  0.1.  Figure 

2  illustrates  the  phase  portrait  under  the  proposed  controls 
proposed  controls  (18)  and  (19). 


Fig.  2.  Phase  portrait  of  three  robots 


V.  Conclusion 

In  this  paper,  we  proposed  a  new  nonlinear  formation 
control  method  for  nonholonomic  mobile  robots.  Formation 
patterns  are  defined  based  on  local  coordinates  with  respect 
to  the  moving  frame  attached  to  the  desired  trajectory  contour 
for  the  group.  Through  the  design  of  a  finite-time  distributed 
observer  for  the  desired  trajectory,  the  formation  control 
design  can  be  done  with  limited  information  for  the  desired 
trajectory.  System  stability  was  rigorously  proved  using  a 
contraction  mapping  method.  Simulation  result  validated  the 
effectiveness  of  the  proposed  design. 
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